ABSTRACT


With  the  growth  of  information  dissemination  over  digital  communication 
networks,  much  research  has  been  devoted  to  compressing  digital  image  information  for 
efficient  transmission.  The  ability  to  adjust  the  desired  resolution  of  an  image  as  the 
available  bandwidth  on  the  network  changes  allows  the  user  to  control  the  flow  of  data 
according  to  the  resources  available.  In  this  thesis  we  integrate  multiresolution  image 
compression  methods  with  image  recognition  techniques  to  assist  in  automatic  image 
recognition  at  several  levels  of  resolution.  Features  of  grayscale  and  binary  images  of  text 
characters  and  aircraft  line  drawings  are  described  using  wavelet  transform  coefficients, 
wavelet  transform  subband  energy,  and  Fourier  transform  coefficients.  Transmission  of 
these features over a digital communication link is simulated, and multiresolution
recognition  performance  in  the  presence  of  channel  noise  is  presented. 

I. INTRODUCTION


The  growth  of  information  exchange  over  digital  communication  networks  has 
made  image  compression  one  of  the  most  important  research  topics  in  recent  years.  The 
ability  to  adjust  the  desired  resolution  of  an  image  as  the  available  bandwidth  on  the 
network  changes  allows  the  user  to  control  the  flow  of  data  according  to  the  resources 
available.  In  this  thesis  a  scheme  for  integrating  multiresolution  image  compression  with 
automatic  image  recognition  techniques  is  presented.  It  is  shown  that  this  method  can 
achieve  fast  and  accurate  image  recognition  at  varying  resolution  levels. 

Figure 1 shows a model of a digital communications link which shall be used in the
rest of the thesis. The available bandwidth on the channel varies with the traffic on the
channel. The user selects one of five resolution levels for the image, adjusting the desired
resolution as traffic varies. Normally, the user selects the lowest resolution level to
minimize the load on the channel. One of the important features in this scheme is the ability
to recognize images at a variety of resolution levels. This allows the user to use low
resolution images to identify images of particular interest and then select higher levels of
resolution only for these images, thus maximizing the utility of the available bandwidth. A
typical scenario in which this model is applicable is transmission from a remote sensor
where the processing power and memory storage available onboard is insufficient to
perform image recognition. Another example is searching an image archive at a remote site.

FIGURE 1. Multiresolution Transmission Scheme, After Ref 9, p. 33

In  this  thesis,  multiresolution  image  compression  with  the  wavelet  transform  is 
integrated  with  image  recognition  algorithms  to  perform  multiresolution  image  recognition 
of  text  characters.  The  processing  overhead  due  to  image  registration  required  to  perform 
recognition  on  images  which  have  undergone  linear  translation  can  be  reduced  by  using 
Fourier  transform  coefficients  as  elements  of  the  feature  vectors.  The  memory  and  the 
bandwidth  required  to  perform  recognition  are  reduced  by  using  the  energy  in  each  subband 
of  the  wavelet  transform  as  elements  of  the  feature  vectors. 

Chapter II introduces the fundamentals of image recognition and shows the
importance of the feature selection problem in designing a recognition system. Chapter III
provides  background  for  multiresolution  signal  decomposition  with  the  wavelet  transform. 
Chapter  IV  presents  results  of  the  proposed  multiresolution  image  recognition  algorithm 
using  text  characters.  Chapter  V  summarizes  the  results  and  suggests  areas  of  future 
research.  Computer  code  used  in  this  thesis  is  presented  in  the  Appendix. 

II. IMAGE RECOGNITION


This  chapter  discusses  the  digital  image  representation  and  recognition  scheme 
which  is  the  backbone  of  the  work  presented  in  this  thesis.  The  scheme  can  be  described  in 
terms  of  a  system  of  image  processing  functions  which  must  be  performed  to  convert  an 
image  of  a  real-world  object  into  a  form  which  a  computer  can  analyze  and  classify  as 
shown  in  Figure  2.  We  shall  use  scanned  images  of  text  characters  to  test  our  recognition 
algorithm,  but  the  discussion  in  this  chapter  holds  for  any  black  and  white  image. 


FIGURE  2.  Image  Recognition  System  Diagram  After  Ref  1,  p.  8 


The  scheme  begins  with  an  image  sensor  which  captures  the  image.  If  the  sensor  is 
an  analog  device,  such  as  a  film  camera,  the  image  must  be  digitized  and  converted  to  a 
grayscale  image,  often  by  using  a  scanner.  For  each  small  area  of  the  image,  known  as  a 
picture  element  or  pixel,  the  scanner  converts  the  analog  image  to  a  number  which 
represents  the  relative  intensity  of  light  in  that  pixel.  A  typical  scanner  may  divide  the 
intensity spectrum between black (zero brightness) and white (maximum brightness) into
256  or  more  distinct  shades  of  gray.  This  is  known  as  a  grayscale  image. 

The  grayscale  image  is  then  passed  to  a  segmentation  routine,  which  separates 
objects  in  the  image  by  applying  a  threshold  to  the  image,  using  the  fact  that  distinguishable 
objects’  pixel  values  differ  markedly  from  the  pixel  values  of  the  surrounding  environment. 
In  Figure  2,  the  objects  are  the  individual  letters  in  the  word  “DOG.”  The  objects  are  then 
input  to  a  feature  extraction  routine,  which  distills  important  information  from  the  image, 
called features. The individual features $\alpha_1$, $\alpha_2$, etc., are combined together into an
n-dimensional feature vector $[\alpha_1, \alpha_2, \ldots, \alpha_n]$, which is then passed to a classification routine.
The  choice  of  features  determines  the  length  of  the  feature  vector.  In  Chapter  IV,  we  shall 
use  feature  vectors  which  vary  from  9  to  over  16,000  elements.  The  classification  routine 
compares  the  feature  vector  obtained  from  the  image  currently  being  classified,  known  as 
the  test  image,  with  the  feature  vectors  obtained  previously  from  reference  images.  The 
classifier  associates  the  test  image  with  one  of  the  reference  images,  and  outputs  the 
decision  thus  made. 

The  performance  of  an  image  recognition  system  is  expressed  using  the  percentage 
of  correctly  classified  images.  For  a  character  recognition  system,  this  is  the  rate  at  which 
the  classifier  outputs  a  “D”  when  a  “D”  was  input,  for  example;  if  the  classifier  declares 
that  the  image  was  anything  other  than  a  “D,”  an  error  has  been  made.  There  are  many 
possible  sources  of  errors.  To  illustrate  this,  a  detailed  discussion  of  each  image  processing 
function  is  required. 

A.  IMAGE  ACQUISITION 

The  two  elements  necessary  to  create  an  image  of  a  real-world  object  in  a  form 
readable  by  a  computer  are  a  physical  device  that  is  sensitive  to  electromagnetic  energy, 
such  as  a  camera  or  charge-coupled  device,  and  a  digitizer,  which  converts  the  information 
into  a  discrete  set  of  numbers  [Ref  1,  p.  10].  Consider  a  flatbed  image  scanner,  for  example, 
which  was  used  to  create  the  reference  images  in  this  thesis.  A  scanner  sensor,  like  many 
photocopier  sensors,  consists  of  a  line  of  silicon  imaging  elements,  called  photosites,  which 
produce an electrical voltage output proportional to the intensity of the light reflected from
the original image as it is scanned [Ref 1, p. 12]. The sensor outputs a set of voltages
corresponding  to  the  distribution  of  brightness  in  the  original  image. 

The  set  of  individual  voltages  is  formed  into  an  NxM  array,  each  element  of  which 
is  a  pixel.  This  array  is  then  fed  to  a  digitizer,  which  converts  the  voltage  levels  to  a  discrete 
set  of  numbers  corresponding  to  the  voltage  in  a  particular  area  of  the  image.  These 
numbers,  called  the  grayscale  values,  are  limited  to  a  particular  range  [0,  J].  Generally,  J  is 
a  power  of  2.  These  three  parameters,  N,  M,  and  J,  determine  the  two  factors  which  govern 
the  resolution  (degree  of  discernible  detail)  of  the  picture  obtained  from  the  digitization 
process: spatial resolution and gray level quantization.

Note  that  there  is  a  tradeoff  between  resolution  and  the  memory  required  to  store 
the  digital  image:  the  finer  the  resolution,  the  greater  the  storage  requirement.  For  example, 
an image comprised of 256 x 256 pixels (65,536 pixels total) quantized into 128 gray levels
(requiring at least 7 bits per pixel, stored as one byte each) takes up approximately 64 kilobytes of memory. The
same  image  using  1024  x  1024  pixels  and  256  gray  levels  requires  over  1  megabyte  of 
storage  [Ref  1,  p.  33].  In  addition  to  the  added  burden  on  memory  resources,  more  pixels 
means  longer  processing  times  at  each  stage  of  the  image  recognition  system,  at  least  until 
feature  extraction.  Determining  the  proper  tradeoff  between  resolution  and  storage 
requirements  is  one  of  the  fundamental  design  decisions  facing  the  image  processing 
system  designer.  For  the  purpose  of  image  recognition,  the  desired  resolution  is  the 
minimum  required  to  extract  feature  vectors  sufficiently  distinct  to  provide  accurate 
classification.  If  the  original  image  does  not  have  sufficient  resolution,  there  are  numerous 
ways  to  enhance  the  image  by  additional  processing. 
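
The storage figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `storage_bytes` and the `byte_aligned` flag are assumptions introduced here and are not part of the thesis code. It reports either the bit-packed size or the size when each pixel occupies a whole number of bytes.

```python
def storage_bytes(rows, cols, gray_levels, byte_aligned=True):
    """Memory needed for a rows x cols image with the given number of gray levels.

    byte_aligned=True stores each pixel in a whole number of bytes
    (assumed storage convention); False packs pixels at the minimum bit width.
    """
    # Smallest number of bits that can index gray_levels distinct values.
    bits_per_pixel = max(1, (gray_levels - 1).bit_length())
    if byte_aligned:
        bytes_per_pixel = (bits_per_pixel + 7) // 8
        return rows * cols * bytes_per_pixel
    total_bits = rows * cols * bits_per_pixel
    return (total_bits + 7) // 8


small = storage_bytes(256, 256, 128)                        # one byte per pixel
packed = storage_bytes(256, 256, 128, byte_aligned=False)   # 7 bits per pixel
large = storage_bytes(1024, 1024, 256)
```

Under the one-byte-per-pixel assumption the 256 x 256 image needs 65,536 bytes (64 kilobytes), while the 1024 x 1024 image with 256 gray levels needs 1,048,576 bytes, a sixteen-fold increase.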

B.  PREPROCESSING 

These  methods,  which  fall  into  the  pre-processing  block  in  the  system  diagram, 
include  filtering,  edge  enhancement,  and  contrast  enhancement.  Each  of  these  is  designed 
to  amplify  crucial  details  present  in  the  image  while  suppressing  noise  and  undesired  image 
elements. 

Filtering can be done in either the spatial or the frequency domain. In the spatial
domain, filtering amounts to convolving the image with a window chosen for its spectral
properties. Examples are shown in Figure 3.

FIGURE 3. Examples of Spatial Filters. After Ref 1, p. 195-200


The  low  pass  filter  in  Figure  3  computes  the  average  of  pixels  in  a  3x3 
neighborhood  around  a  pixel  in  the  center.  This  suppresses  noise,  but  blurs  sharp  edges. 
The  high-pass  filter  enhances  edges,  but  magnifies  noise.  The  Sobel  filter  computes  the 
gradient in the y direction at the center pixel. When this quantity is added to (subtracted
from) the original pixel value, the contrast between the center pixel and its neighbors is increased
(decreased). If the pixels in this 3x3 neighborhood are uniform, this operation has no effect.
However,  if  the  neighborhood  contains  a  horizontal  edge,  the  value  of  the  center  pixel  will 
be  changed  considerably,  thus  increasing  the  contrast  between  this  pixel  and  background 
pixels.  The  Sobel  vertical  filter  is  the  transpose  of  the  horizontal  filter. 
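
The behavior of these 3x3 windows can be illustrated with a short sketch. The kernel values below are the common textbook forms and are an assumption here, since the contents of Figure 3 are not reproduced in the text; the function name `filter3x3` and the edge-replicating border handling are likewise illustrative choices. The routine applies the window in correlation form (the kernel is not flipped), which is identical to convolution for the symmetric low-pass window and differs only in sign for the Sobel windows.

```python
import numpy as np

# Assumed textbook kernels; the thesis's Figure 3 may use different values.
LOW_PASS = np.ones((3, 3)) / 9.0
HIGH_PASS = np.array([[-1, -1, -1],
                      [-1,  8, -1],
                      [-1, -1, -1]]) / 9.0
SOBEL_HORIZONTAL = np.array([[-1, -2, -1],
                             [ 0,  0,  0],
                             [ 1,  2,  1]], dtype=float)  # gradient in y
SOBEL_VERTICAL = SOBEL_HORIZONTAL.T                        # gradient in x


def filter3x3(image, kernel):
    """Apply a 3x3 window to every pixel, replicating edge pixels at the border."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros_like(image)
    # Accumulate the nine weighted, shifted copies of the image.
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out


# A uniform image and an image containing a single horizontal edge.
flat = np.full((5, 5), 7.0)
edge = np.zeros((6, 6))
edge[3:, :] = 1.0
edge_response = filter3x3(edge, SOBEL_HORIZONTAL)
```

On the uniform image the Sobel windows return zero everywhere and the low-pass window leaves the image unchanged; on the horizontal-edge image only the horizontal Sobel window responds, and only in the rows adjacent to the edge.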

Filtering  in  the  frequency  domain  is  performed  by  computing  a  two-dimensional 
transformation  of  the  image,  such  as  the  Fourier  transform,  and  then  assigning  various 
weights  to  the  components  in  the  transform  domain  to  accomplish  a  desirable  enhancement. 
Just  as  in  the  spatial  domain,  there  is  a  tradeoff  between  reducing  noise  and  reducing  the 
contrast in the vicinity of edges. Weighting low frequencies relatively more than high
frequencies reduces noise, but blurs the sharp edges in the image. Conversely,
weighting  higher  frequencies  more  than  lower  frequencies  enhances  edges  at  the  cost  of 
magnifying  noise.  Finding  the  optimal  mix  of  enhancements  for  a  given  image  processing 
application  must  be  performed  on  a  case  by  case  basis. 
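
As a minimal sketch of frequency domain filtering, the example below weights the two-dimensional Fourier coefficients with an ideal low-pass mask (weight 1 inside a cutoff radius, 0 outside). The ideal mask, the function name `ideal_lowpass`, and the cutoff expressed in cycles per sample are assumptions chosen for brevity; practical systems usually prefer smoother weightings.

```python
import numpy as np


def ideal_lowpass(image, cutoff):
    """Zero every Fourier component whose radial frequency exceeds cutoff.

    cutoff is in cycles per sample (0.5 is the highest representable frequency
    along one axis).
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(cols)[None, :]   # horizontal frequencies
    weights = (np.sqrt(fx ** 2 + fy ** 2) <= cutoff).astype(float)
    spectrum = np.fft.fft2(image) * weights
    # The mask is symmetric, so the result is real up to rounding error.
    return np.real(np.fft.ifft2(spectrum))


# A constant background of 5 with a checkerboard pattern at the highest frequency.
m, n = np.indices((8, 8))
checkerboard = (-1.0) ** (m + n)
smoothed = ideal_lowpass(5.0 + checkerboard, cutoff=0.25)
```

The checkerboard, which behaves like fine-grained noise, is removed and only the constant background remains; a sharp edge passed through the same filter would be blurred for the same reason.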

C. SEGMENTATION


After  the  image  has  been  enhanced,  the  next  step  is  to  distinguish  between  objects 
and  surrounding  background  and  to  differentiate  among  objects.  The  most  basic 
segmentation  method  is  grayscale  thresholding.  The  segmentation  algorithm  converts  the 
grayscale  image  to  a  binary  image,  where  every  pixel  below  the  threshold  is  assigned  a 
value  of  0,  and  every  pixel  above  the  threshold  is  assigned  a  1.  If  one  has  no  prior 
knowledge  of  the  type  of  image  to  be  processed,  one  technique  of  selecting  a  threshold  is 
histogram partitioning, as shown in Figure 4.

FIGURE 4. Thresholding Using Image Histogram


In  this  example,  we  see  an  image  of  a  text  character  in  Figure  4a.  A  histogram  of  its 
pixel  values  is  shown  in  Figure  4b.  There  is  a  clear  break  between  one  cluster  of  pixel 
values  and  the  other,  making  the  choice  of  the  threshold  at  T  obvious.  Unfortunately,  for 
many  images,  such  as  the  text  character  shown  in  Figure  4c,  the  choice  is  often  not  as  clear 
due  to  noise  in  the  scanning  process  or  the  nature  of  the  object  and  background.  This 
character’s  histogram  is  shown  in  Figure  4d.  Note  that  there  is  no  clear  threshold  for  this 
poorly  processed  image.  Nevertheless,  for  the  purpose  of  character  recognition,  the  choice 
of  the  threshold  for  a  particular  image  is  not  as  important  as  consistency  between 
thresholding the reference image and the test images. So long as both the reference image
and  the  test  image  use  a  similar  threshold,  the  feature  vectors  extracted  from  each  should 
be  similar. 
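
Grayscale thresholding can be sketched as follows. The threshold selection shown is a simple iterative two-cluster rule (the midpoint between the mean of the darker pixels and the mean of the brighter pixels), which is an assumed stand-in for reading the break off the histogram by eye; it is not necessarily the procedure used in this thesis. The function names are hypothetical.

```python
import numpy as np


def select_threshold(image, tolerance=0.5, max_iterations=100):
    """Iteratively place the threshold midway between the two cluster means."""
    pixels = np.asarray(image, dtype=float).ravel()
    threshold = pixels.mean()
    for _ in range(max_iterations):
        low = pixels[pixels <= threshold]
        high = pixels[pixels > threshold]
        if low.size == 0 or high.size == 0:
            break  # only one cluster present; keep the current threshold
        new_threshold = 0.5 * (low.mean() + high.mean())
        converged = abs(new_threshold - threshold) < tolerance
        threshold = new_threshold
        if converged:
            break
    return threshold


def binarize(image, threshold):
    """Pixels above the threshold become 1, all others become 0."""
    return (np.asarray(image) > threshold).astype(np.uint8)


# Dark 2x2 object (gray level 20) on a bright background (gray level 200).
image = np.full((4, 4), 200)
image[1:3, 1:3] = 20
threshold = select_threshold(image)
binary = binarize(image, threshold)
```

Applying the same `select_threshold` and `binarize` routines to both the reference and the test images gives the consistency between the two that the paragraph above calls for.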

D.  FEATURE  EXTRACTION  AND  IMAGE  REGISTRATION 

After  segmentation,  the  binary  image  is  passed  to  a  feature  extraction  routine. 
Image  features  are  quantities  which  carry  information  about  the  object  in  the  image,  such 
as  size,  texture,  color,  etc.  They  may  also  contain  transform  domain  information  such  as  the 
energy  in  a  particular  band  of  frequencies.  Images  may  be  formed  into  classes  based  on  the 
similarity  of  their  feature  vectors.  The  choice  of  features  which  efficiently  and  accurately 
describe  the  various  classes  of  images  is  known  as  the  feature  selection  problem.  In  general, 
spatial  domain  features  are  sensitive  to  variations  in  image  translation,  rotation,  and  scale 
changes  [Ref  1,  p.  501].  If  one  uses  spatial  domain  features,  one  must  compensate  for 
spatial  variances  by  registering  the  image. 

Registration  is  the  process  by  which  one  corrects  for  relative  translational  shifts, 
rotational  shifts,  and  resolution  differences  from  one  image  to  another  [Ref  2,  p.  562]. 
Image  translation  can  be  compensated  for  by  calculating  the  correlation  between  the 
reference image $x_r(m,n)$ and a test image $x_t(m,n)$ for all possible shifts of the object within
the image. In its simplest form, this measure is defined as

$$R(j,k) = \frac{\sum_{m=1}^{M}\sum_{n=1}^{N} x_r(m,n)\, x_t(m-j,n-k)}{\left[\sum_{m=1}^{M}\sum_{n=1}^{N} x_r(m,n)^2\right]^{1/2} \left[\sum_{m=1}^{M}\sum_{n=1}^{N} x_t(m-j,n-k)^2\right]^{1/2}} \qquad \text{(EQ 1)}$$


where (m,n) are pixel positions in an M x N image and (j,k) is the relative shift being
tested. There are several problems with this method, however. First, the correlation may be
relatively broad, i.e., having no single sharp peak in the correlation function. Second, noise
may mask the true peak. Finally, registration is computationally expensive, especially if the
relative motion between images is
significant  [Ref  2,  p.  566]. 
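
The exhaustive correlation search described above can be sketched as follows. The sketch assumes circular (wrap-around) shifts so that every shifted image has the same size, and the names `correlation_surface` and `estimate_shift` are hypothetical. The double loop over all M x N shifts, each requiring a full M x N sum, is what makes registration expensive.

```python
import numpy as np


def correlation_surface(reference, test):
    """Normalized correlation between reference and every circular shift of test."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    rows, cols = reference.shape
    ref_norm = np.sqrt(np.sum(reference ** 2))
    surface = np.zeros((rows, cols))
    for j in range(rows):
        for k in range(cols):
            # shifted[m, n] == test[m - j, n - k], with wrap-around at the borders
            shifted = np.roll(test, shift=(j, k), axis=(0, 1))
            denominator = ref_norm * np.sqrt(np.sum(shifted ** 2))
            if denominator > 0:
                surface[j, k] = np.sum(reference * shifted) / denominator
    return surface


def estimate_shift(reference, test):
    """Return the shift (j, k) of test that maximizes the correlation."""
    surface = correlation_surface(reference, test)
    j, k = np.unravel_index(np.argmax(surface), surface.shape)
    return int(j), int(k)


rng = np.random.default_rng(0)
reference = rng.random((8, 8))
test = np.roll(reference, shift=(3, 5), axis=(0, 1))   # translated copy
shift = estimate_shift(reference, test)
registered = np.roll(test, shift=shift, axis=(0, 1))
```

Rolling the test image by the estimated shift brings it back into alignment with the reference, after which spatial domain features of the two images can be compared directly.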

Another solution to the registration problem is to find an alternate representation of
the image which is invariant to translation, rotation, contraction, or reduction. One example
of this is to take the Fourier transform of the image and use the Fourier coefficients to create
feature vectors. The magnitudes of the Fourier transform coefficients in a rectangular
coordinate system are invariant to linear translation, i.e., translating the object in the
spatial domain changes the phase of the coefficients, but not their magnitude
[Ref 1, p. 95]. We will use this property to our advantage in Chapter IV by taking the Fourier
transform of wavelet transform coefficients to obtain multiresolution translation invariant
recognition. However, Fourier coefficients in a rectangular coordinate system are not
invariant to rotation [Ref 1, p. 99]. Rotating an image by an angle $\varphi$ rotates
the Fourier transform by the same angle. If one first converts the image to polar
coordinates,  one  can  obtain  a  polar  Fourier  transform,  for  which  the  magnitude  of  the 
coefficients  is  invariant  to  rotation.  Unfortunately,  one  loses  linear  translation  invariance 
in  the  process.  Since  the  focus  of  this  thesis  is  on  the  recognition  of  text  characters,  which 
are  generally  not  subject  to  rotational  changes,  we  shall  not  address  rotational  variance 
further. 
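
The translation invariance of the Fourier magnitudes is easy to verify numerically. The sketch below assumes a circular shift, for which the invariance of the discrete Fourier transform magnitude is exact; the function name `fourier_magnitude_features` is hypothetical.

```python
import numpy as np


def fourier_magnitude_features(image):
    """Flattened magnitudes of the 2-D DFT coefficients, used as a feature vector."""
    return np.abs(np.fft.fft2(np.asarray(image, dtype=float))).ravel()


rng = np.random.default_rng(1)
image = rng.random((16, 16))
translated = np.roll(image, shift=(3, 5), axis=(0, 1))

features = fourier_magnitude_features(image)
features_translated = fourier_magnitude_features(translated)

# The complex coefficients themselves differ, because translation changes the phase.
coefficients_equal = np.allclose(np.fft.fft2(image), np.fft.fft2(translated))
```

The two magnitude feature vectors agree to within rounding error even though the complex coefficients do not, so no registration step is needed before they are compared.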

Once  the  feature  vector  has  been  obtained,  it  is  compared  with  feature  vectors  from 
various  reference  images  which  represent  the  different  classes  of  the  image.  The  feature 
vectors  of  the  reference  images  occupy  an  n-dimensional  Euclidean  space,  called  the 
feature space, as shown in Figure 5. Each reference feature vector $r_j$ represents a point in
the feature space. One logical and simple way to classify test images is to compute the
Euclidean distance between each of the reference feature vectors $r_j$ and the test feature
vector $t$:

$$\mathrm{dist}(r_j, t) = \left[(r_j - t)^T (r_j - t)\right]^{1/2}. \qquad \text{(EQ 2)}$$

The  test  image  is  then  associated  with  the  reference  image  which  has  the  smallest  distance 
measure. If the net effect of all the various forms of image noise can be modeled as
additive white Gaussian noise, it can be shown that the minimum distance classifier is
optimum [Ref 1, p. 581].

FIGURE 5. Feature Vectors And Euclidean Distance in Feature Space

Another  possible  classification  method  is  a  neural  network  based  approach,  which 
has  great  utility  when  the  statistical  properties  of  the  pattern  classes  are  unknown.  A 
multilayer  network  is  trained  with  the  reference  images  using  a  learning  technique  such  as 
the backpropagation algorithm [Ref 1, pp. 595-602]. If the network has learned the proper
classification for each reference image, it should be able to correctly classify a test image.
Unfortunately, analyzing the performance due solely to the influence of the selection of
image  features  is  difficult.  In  this  thesis,  we  shall  use  the  minimum  distance  classifier. 
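
A minimum distance classifier amounts to a nearest-neighbor search over the reference feature vectors. The sketch below is a minimal illustration; the function name `classify`, the dictionary of references, and the three-element feature vectors are made up for the example and are not taken from the thesis.

```python
import numpy as np


def classify(test_vector, references):
    """Return (label, distance) of the reference vector closest to test_vector.

    references maps a class label to its reference feature vector.
    """
    test_vector = np.asarray(test_vector, dtype=float)
    best_label, best_distance = None, np.inf
    for label, reference in references.items():
        distance = np.linalg.norm(np.asarray(reference, dtype=float) - test_vector)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label, best_distance


# Hypothetical reference vectors for three character classes.
references = {
    "D": [1.0, 0.0, 0.0],
    "O": [0.0, 1.0, 0.0],
    "G": [0.0, 0.0, 1.0],
}
label, distance = classify([0.9, 0.2, 0.1], references)
```

Because the classifier has no trainable parameters, any change in the recognition rate can be attributed to the choice of features, which is the property that motivates its use here.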

We  have  seen  that  the  performance  of  an  image  recognition  system  depends  on  a 
chain  of  processes,  each  one  of  which  is  essential  to  the  proper  functioning  of  the  system. 
Feature  selection  is  an  important  part  of  the  recognition  system.  The  wavelet  transform, 
which  supports  a  multiresolution  approach  to  the  feature  selection  problem,  is  presented  in 
the  next  chapter. 

III. MULTIRESOLUTION IMAGE COMPRESSION AND THE WAVELET TRANSFORM


Multiresolution  signal  decomposition  can  be  used  as  a  form  of  flow  control  in  a 
packet  switched  network  by  allowing  a  user  to  adjust  the  desired  resolution  of  an  image 
based  on  the  available  bandwidth  in  a  communications  channel  [Ref  9,  p.  1].  We  will  show 
how  the  wavelet  transform  allows  us  to  perform  multiresolution  image  compression  and 
show  how  this  can  be  integrated  with  the  algorithms  from  Chapter  II  to  recognize  images 
at  different  levels  of  resolution. 

A.  MULTIRESOLUTION  IMAGE  COMPRESSION  AND  RECOGNITION 


Figure  6  shows  the  model  of  a  multiresolution  image  compression  and  recognition 
scheme  over  a  digital  transmission  channel  (packet  switched  network)  used  in  this  thesis. 


FIGURE  6.  Multiresolution  Transmission  Scheme,  After  Ref  9,  p.  33 


In  this  scheme,  the  user  desires  to  transmit  images  of  unknown  content  from  a 
remote  site  as  quickly  as  possible.  This  is  often  a  problem  when  searching  an  image 
archive.  The  available  bandwidth  in  a  packet  switched  network  varies  with  the  requested 
load  on  the  channel.  Multiresolution  image  coding  allows  the  user  to  select  the  desired 
amount  of  detail  in  an  image  to  be  transmitted  depending  on  the  current  load  as  measured 
by the packet delay through the channel. If the channel is congested, the user will want to
transmit  images  at  the  lowest  level  of  resolution  from  which  they  can  be  recognized.  If  the 
user  decides  that  the  image  is  of  particular  interest,  it  can  be  enhanced  by  adding  additional 
detail  with  further  transmissions  [Ref  9,  p.  26]. 

To determine rapidly and accurately which images are of interest, the user requires
a way to compare images at different levels of resolution as shown in Figure 7. As stated
in Chapter II, registering images at different levels of resolution is computationally
expensive,  so  we  require  a  way  to  compare  images  without  first  registering  them. 


FIGURE 7. Two Images of an Object at Different Scales

The  solution  is  to  use  the  orthogonal  projection  of  the  image  in  Figure  7b  onto  the 
space  spanned  by  the  image  in  Figure  7a.  We  obtain  the  orthogonal  projection  simply  by 
taking the wavelet transform of the image in Figure 7b and discarding the smallest scale
wavelet transform coefficients, then performing the inverse wavelet transform [Ref 11, pp.
314-320]. Alternatively, we can perform image recognition in the wavelet domain, using
wavelet  coefficients  as  image  features  in  our  recognition  scheme.  Before  showing  how  this 
can  be  done,  we  first  present  a  brief  overview  of  the  wavelet  transform. 

B.  ORTHOGONAL  BASIS  FUNCTION  EXPANSIONS 

We desire to decompose a signal $x(t)$ using elementary functions $\phi_1(t), \phi_2(t), \ldots$ so
that we may discern information present in the signal which may not have been obvious in
its original form. If every function contained in a vector space V can be written as a linear
combination of linearly independent vectors $\phi_k(t)$ which span V, i.e.,

$$x(t) = \sum_k c_k \phi_k(t), \qquad \text{(EQ 3)}$$

then the set $\phi_1(t), \phi_2(t), \ldots$ forms a basis for V [Ref 5, p. 78]. The $c_k$'s are the transform
coefficients, i.e., $c_k$ is the projection of the function $x(t)$ onto the basis function $\phi_k(t)$. If
the basis functions are orthogonal, we can compute this projection by taking the inner
product of the signal $x(t)$ with the basis function $\phi_k(t)$ [Ref 10, p. 245]. We define the inner
product operation for continuous signals as

$$\langle x(t), \phi_k(t) \rangle = \int_t x(t)\, \phi_k^*(t)\, dt \qquad \text{(EQ 4)}$$

and for a discrete signal of length N,

$$\langle x(n), \phi_k(n) \rangle = \sum_{n=1}^{N} x(n)\, \phi_k^*(n). \qquad \text{(EQ 5)}$$

The set $\phi_1(n), \phi_2(n), \ldots$ is said to be orthonormal if

$$\langle \phi_i(n), \phi_j(n) \rangle = \delta_{ij}, \qquad \text{(EQ 6)}$$

where $\delta_{ij}$ is the Kronecker delta function.

One well-known form of signal decomposition which has these properties is the
discrete Fourier transform, in which the $\phi_k(n)$ are complex exponentials at a single
frequency. Each transform coefficient, therefore, is simply the signal content at the
frequency of the exponential. Any discrete signal can be represented as a weighted sum of
these basis functions, i.e.,

$$x(n) = \frac{1}{N} \sum_{k=1}^{N} c_k\, e^{j 2\pi k n / N}, \qquad \text{(EQ 7)}$$

where each $c_k$ is the coefficient at the k-th frequency,

$$c_k = \sum_{n=1}^{N} x(n)\, e^{-j 2\pi k n / N}. \qquad \text{(EQ 8)}$$

This is equivalent to a filter bank formation as shown in Figure 8a [Ref 8, p. 800].

FIGURE 8. Fourier Transform as Ideal Orthogonal Filter Bank Tiling the Frequency Axis

Each  basis  function  in  the  Fourier  transform  corresponds  to  a  non-overlapping  bandpass 
filter. The output of each of these filters is the content of the signal x(n) in each frequency
band.  The  division  of  the  frequency  axis  into  contiguous  nonoverlapping  sections  is 
known  as  tiling,  as  shown  in  Figure  8b. 
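
The view of the discrete Fourier transform as a set of inner products with complex exponential basis functions can be checked directly. The sketch below uses 0-based indices (n, k = 0, ..., N-1), as NumPy does, and places the 1/N normalization on the synthesis sum; both are assumed conventions, and the function names are hypothetical.

```python
import numpy as np


def dft_basis(N):
    """Matrix whose row k is the basis function exp(j*2*pi*k*n/N), n = 0..N-1."""
    n = np.arange(N)
    k = n[:, None]
    return np.exp(2j * np.pi * k * n / N)


def dft_coefficients(x):
    """c_k = inner product of x with the k-th basis function."""
    x = np.asarray(x, dtype=complex)
    return dft_basis(len(x)).conj() @ x


def reconstruct(c):
    """x(n) = (1/N) * sum over k of c_k times the k-th basis function."""
    c = np.asarray(c, dtype=complex)
    return (dft_basis(len(c)).T @ c) / len(c)


x = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5])
c = dft_coefficients(x)
x_rebuilt = reconstruct(c)

# Inner products between basis functions: N on the diagonal, 0 elsewhere.
basis = dft_basis(len(x))
gram = basis @ basis.conj().T
```

The Gram matrix equals N times the identity, so the exponentials are orthogonal and become orthonormal when each is scaled by 1/sqrt(N). The coefficients computed this way match those returned by `np.fft.fft`.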

Frequency  is  not  the  only  important  characteristic  of  a  signal.  An  equally  important 
feature,  particularly  for  images,  is  scale.  Scale  is  the  amount  of  time  or  space  over  which  a 
particular  signal  component  is  significant.  Note  that  while  the  Fourier  coefficients  locate 
signal  energy  well  in  frequency,  the  time  or  space  resolution  is  poor.  Wavelet  analysis  does 
not  suffer  this  limitation  and  provides  a  way  to  localize  signals  in  both  scale  and  space. 

C. 1-D WAVELET TRANSFORM


We  start  with  the  multiresolution  formulation  of  Mallat  for  one-dimensional  signals 
[Ref  6,  p.  6].  Any  signal  can  be  decomposed  into  low-pass  and  high-pass  signals.  Formally, 
a vector space $V_{j+1}$ is comprised of two spaces: $V_j$, which contains all low-pass functions,
and $W_j$, which contains all high-pass functions, or

$$V_j \oplus W_j = V_{j+1}. \qquad \text{(EQ 9)}$$

The spaces $V_j$ and $W_j$ are called orthogonal complements of each other because they are
nonoverlapping, and they combine to span $V_{j+1}$. One can decompose a signal x(n) which
is comprised of frequencies in the range $[0, \pi]$ by one set of basis functions which spans
the range $V_j = [0, \pi/2]$ and another set which spans the range $W_j = [\pi/2, \pi]$, as shown in
Figure 9a. If we extend this idea by successively dividing the range $[0, \pi/2]$ into smaller
and smaller subspaces, as shown in Figure 9b, we see that any subspace $V_j$ contains an
infinite number of subspaces $V_{j-1}$, $V_{j-2}$, etc. Each of these spaces also has a
complementary space $W_{j-1}$, $W_{j-2}$, etc. The space which contains all square summable
functions, known as $L^2$, contains all subspaces $V_j$. This relation is summarized below:

$$\cdots \subset V_{j-2} \subset V_{j-1} \subset V_j \subset \cdots \subset L^2. \qquad \text{(EQ 10)}$$

Mallat showed that the spaces $V_j$, $V_{j-1}$, etc. are spanned by dilations and integer
translations of a single scaling function $\phi_j(n)$ (a low-pass function), and the spaces $W_j$ are
similarly spanned by dilations and integer translations of $\psi_j(n)$ (a high-pass function) [Ref
6, p. 6]. Because $V_{j-1}$ and $W_{j-1}$ are subspaces contained wholly within $V_j$, $\phi_j(n)$ and $\psi_j(n)$
at scale j can be expressed in terms of a filter and the functions themselves at scale j-1. If
h(n) is a low-pass filter and g(n) is a high-pass filter, then

$$\phi_j(n) = \sum_i h(i)\, \phi_{j-1}(2n - i), \text{ and} \qquad \text{(EQ 11)}$$

$$\psi_j(n) = \sum_i g(i)\, \phi_{j-1}(2n - i), \qquad \text{(EQ 12)}$$

where $\phi_{j-1}(2n-i)$ is the ith translation of $\phi_{j-1}(2n)$, which is a basis function for scale j-1,
and $\phi_j(n)$ is a basis function for scale j.

If  we  make  the  translations  and  dilations  of  the  scaling  functions  orthogonal,  then 
it  can  be  shown  that  any  function  x(n)  can  be  decomposed  into  high-pass  and  low-pass 
signal  components,  i.e.. 


$$x(n) = \sum_k c_k\, \phi_k(n) + \sum_j \sum_k d_{jk}\, \psi_{jk}(n), \qquad \text{(EQ 13)}$$

where

$$c_k = \langle x(n), \phi_k(n) \rangle \qquad \text{(EQ 14)}$$

and

$$d_{jk} = \langle x(n), \psi_{jk}(n) \rangle. \qquad \text{(EQ 15)}$$

Here $c_k$ is the inner product of x(n) and $\phi_k(n)$, and $d_{jk}$ is the inner product of x(n) and
$\psi_{jk}(n)$, and k indicates the translation index [Ref 6, p. 9]. Further, it can be shown that in
order for the reconstruction of the signal to be perfect, h(n) and g(n) must form a
quadrature mirror filter (QMF) pair [Ref 6, p. 15]. These fundamental results of wavelet analysis
were  developed  independently  in  the  field  of  subband  coding. 

D.  IMPLEMENTATION  WITH  QUADRATURE  MIRROR  FILTERS 

The  basic  building  block  of  discrete  time  wavelet  analysis  is  the  quadrature  mirror 
filter  pair.  This  set  of  filters  allows  us  to  decompose  a  signal  into  low-pass  and  high-pass 
signal  components,  decimate  the  resulting  signals,  and  then  to  reconstruct  the  original 
signal  from  these  components  perfectly.  A  block  diagram  of  this  scheme  is  shown  in  Figure 
10,  and  the  frequency  response  of  a  quadrature  mirror  filter  pair  is  shown  in  Figure  11. 


FIGURE 10. Block Diagram of Analysis And Synthesis Sections of a Quadrature Mirror Filter Bank

FIGURE 11. Frequency Response Of QMF Pair (plot title: Frequency Response For Daubechies-20 Filter)

If  we  continue  to  apply  signal  decomposition  with  quadrature  mirror  filters  on  the 
low-pass  signal  component,  we  obtain  Mallat's  scheme  for  the  discrete  wavelet  transform 
[Ref  6,  p.  11].  The  input  signal  x(n)  is  successively  low-pass  and  high-pass  filtered 
followed by downsampling. The resulting functions $c_j$ and $d_j$ represent the low-pass
“coarse” and high-pass “detail” signals, respectively, at each scale. From the previous
discussion, $c_j$ is a function in space $V_j$, and $d_j$ is a function in space $W_j$. Figure 12 shows a
representation  of  this  scheme,  which  is  known  as  the  fast  wavelet  transform. 
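
The filter-and-downsample recursion can be sketched with the two-tap Haar filters, the shortest quadrature mirror filter pair. Haar is an assumption made here to keep the example short; the figures in this chapter refer to a Daubechies-20 filter, and the function names below are hypothetical. The signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def analyze(x):
    """One analysis stage: low-pass and high-pass filter, then downsample by 2."""
    x = np.asarray(x, dtype=float)
    coarse = (x[0::2] + x[1::2]) / SQRT2   # h = [1, 1] / sqrt(2)
    detail = (x[0::2] - x[1::2]) / SQRT2   # g = [1, -1] / sqrt(2)
    return coarse, detail


def synthesize(coarse, detail):
    """One synthesis stage: exactly inverts analyze."""
    x = np.empty(2 * len(coarse))
    x[0::2] = (coarse + detail) / SQRT2
    x[1::2] = (coarse - detail) / SQRT2
    return x


def fast_wavelet_transform(x, levels):
    """Repeatedly split the coarse signal; details are returned finest scale first."""
    coarse = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        coarse, detail = analyze(coarse)
        details.append(detail)
    return coarse, details


def inverse_fast_wavelet_transform(coarse, details):
    """Rebuild the signal by merging details from the coarsest scale outward."""
    for detail in reversed(details):
        coarse = synthesize(coarse, detail)
    return coarse


x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
coarse, details = fast_wavelet_transform(x, levels=3)
x_rebuilt = inverse_fast_wavelet_transform(coarse, details)

# Lower-resolution approximation: zero the finest-scale detail before inverting.
reduced = [np.zeros_like(details[0])] + details[1:]
x_low_resolution = inverse_fast_wavelet_transform(coarse, reduced)
```

The reconstruction is exact and, because the filters are orthonormal, the signal energy equals the sum of the energies of the coarse and detail coefficients. Zeroing the finest-scale detail before inversion yields the lower-resolution approximation in which each pair of samples is replaced by its average.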

FIGURE 12. Fast Wavelet Transform Block Diagram

E.  TWO-DIMENSIONAL  FAST  WAVELET  TRANSFORM 

The  one-dimensional  discrete-time  wavelet  transform  can  be  extended  to  two 
dimensions  by  assuming  that  the  scaling  and  wavelet  functions  are  separable,  i.e., 

$$\phi_j(m, n) = \phi_j(m)\,\phi_j(n) \qquad \text{(EQ 16)}$$

At each scale j, we need three wavelet functions, corresponding to the cases where we
have low-pass frequencies in m and high-pass frequencies in n, high-pass in m and
low-pass in n, and high-pass in both variables, respectively. These are written as

$$\psi_j^{1}(m, n) = \phi_j(m)\,\psi_j(n) \qquad \text{(EQ 17)}$$

$$\psi_j^{2}(m, n) = \psi_j(m)\,\phi_j(n) \qquad \text{(EQ 18)}$$

$$\psi_j^{3}(m, n) = \psi_j(m)\,\psi_j(n) \qquad \text{(EQ 19)}$$

If M = 2^J, which we shall assume throughout this thesis, an M x M image will have
J orthogonal scales. As shown in Figures 13 and 14, the two-dimensional wavelet transform
decomposes the signal into four subbands, k = 1, 2, 3, and 4, at each scale j. The k = 1
subband, which is low-pass in both the m and n directions, is then decomposed further at one
scale lower. This process may be repeated until the signal at the largest scale is a single
coefficient. Dividing an image into components of differing scale is similar to tiling the
frequency axis as discussed in section III.B. Subband energy is a measure of the signal
content in a particular tile. The energy E_j^k in a given K x K subband is the sum of the
squares of the wavelet coefficients in that subband,

$$E_j^k = \sum_{i=1}^{K} \sum_{l=1}^{K} \left[ w_j^k(i, l) \right]^2, \qquad j \in \{1, 2, \ldots, J\};\; k \in \{1, 2, 3, 4\}, \qquad \text{(EQ 20)}$$

where w_j^k(i, l) denotes the wavelet coefficient at position (i, l) of subband k at scale j.

In  Chapter  IV,  we  will  show  that  subband  energy  values  can  be  used  as  features  in  an 
image  recognition  scheme. 
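As a rough illustration of the separable two-dimensional decomposition and the subband energy measure, the following Python sketch filters the rows and then the columns of an image and sums the squared coefficients in each subband. It assumes the Haar pair instead of the Daubechies filters of the thesis; the subband labels "LL", "LH", "HL", "HH" and the function names are hypothetical, and their correspondence to the indices k = 1 to 4 is not taken from the source.

```python
import math

S = 1.0 / math.sqrt(2.0)


def split(v):
    """Haar low-pass and high-pass halves of a one-dimensional list."""
    lo = [S * (v[2 * i] + v[2 * i + 1]) for i in range(len(v) // 2)]
    hi = [S * (v[2 * i] - v[2 * i + 1]) for i in range(len(v) // 2)]
    return lo, hi


def transpose(m):
    return [list(col) for col in zip(*m)]


def dwt2_level(img):
    """One scale of the separable 2-D transform: filter along rows, then columns."""
    row_lo, row_hi = zip(*(split(r) for r in img))

    def cols(block):
        lo, hi = zip(*(split(c) for c in transpose(block)))
        return transpose(lo), transpose(hi)

    ll, lh = cols(row_lo)  # low-pass along rows
    hl, hh = cols(row_hi)  # high-pass along rows
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}


def energy(block):
    """Sum of the squared coefficients in one subband."""
    return sum(w * w for row in block for w in row)


def energy_features(img, scales):
    """Energies of the three detail subbands at each scale, finest scale first."""
    feats = []
    ll = img
    for _ in range(scales):
        bands = dwt2_level(ll)
        feats += [energy(bands[k]) for k in ("LH", "HL", "HH")]
        ll = bands["LL"]
    return feats
```

A constant image places all of its energy in the "LL" subband, while an image of alternating columns places all of its energy in the subband that is high-pass along the rows.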


FIGURE 13. Two-Dimensional Fast Wavelet Transform

FIGURE 14. Decomposition of Image Into Components in Orthogonal Subspaces


F.  MULTIRESOLUTION  CODING  USING  THE  WAVELET  TRANSFORM 

The two-dimensional wavelet transform can be used to create a multiresolution
image compression scheme [Ref 7, p. 26]. Figure 15 shows the tiling used for the proposed
multiresolution image compression and recognition scheme.

FIGURE 15. Multiresolution Image Compression Using Wavelet Transform


In this scheme, we decompose the signal at five different levels of resolution, which
are labeled R1, R2, R3, R4, and R5. The feature vector for R1 consists of the wavelet
coefficients at scale J-4, R2 consists of R1 enhanced by the wavelet coefficients at scale
J-3, etc. The feature vector for R5 is the entire wavelet decomposition of the test image, as
shown in Figure 15. The data compression ratio Q for each resolution level Ri is a function
of the resolution level used [Ref 9, p. 27]. Each ratio is obtained as the ratio of the total
number of bits in the original image to the total number of bits transmitted. Table 1 shows the
compression ratios for each of the resolution levels.

Table 1: COMPRESSION RATIOS FOR RESOLUTION LEVELS

Resolution Level    Q
R1                  256:1
R2                  64:1
R3                  16:1
R4                  4:1
R5                  1:1
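The ratios in Table 1 follow from counting retained coefficients. The Python sketch below is a hedged illustration: it assumes a 128 x 128 image, five resolution levels, a retained coefficient matrix whose side doubles with each level, and the same number of bits per coefficient as per original pixel. The function name `compression_ratio` is hypothetical.

```python
def compression_ratio(image_size, level, levels=5):
    """Original coefficient count divided by the count kept at resolution level `level`."""
    kept_side = image_size >> (levels - level)  # side of the retained coefficient matrix
    return (image_size * image_size) // (kept_side * kept_side)


ratios = [compression_ratio(128, r) for r in range(1, 6)]
```

Under these assumptions, resolution level R1 keeps an 8 x 8 block of the 128 x 128 coefficient matrix, which gives 16384 / 64 = 256.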

G.  MULTIRESOLUTION  IMAGE  RECOGNITION  SCHEME 

As stated in section III.A, in order to identify images of particular interest rapidly and
accurately, we require a way to perform image recognition on images at various
resolution  levels.  One  method  is  to  interpolate  the  low  resolution  image  until  it  is  the  same 
size  as  the  higher  resolution  image.  One  can  then  input  this  image  to  the  recognition 
algorithm  proposed  in  Chapter  II. 

An  alternative  method  is  to  perform  the  comparison  of  the  two  images  in  the 
wavelet  domain.  In  this  technique,  the  distance  between  the  feature  vector  at  resolution 
level  R1-R5  and  a  reference  vector  is  computed.  The  reference  vector  is  obtained  by  taking 
the  wavelet  transform  of  the  reference  image  and  discarding  wavelet  coefficients  at 
resolution  levels  higher  than  the  test  image.  A  block  diagram  of  this  scheme  is  shown  in 
Figure  16. 




FIGURE  16.  Multiresolution  Image  Compression  and  Recognition  Scheme 


The  scheme  assumes  that  the  user  desires  to  transmit  an  image  which  he  knows  can 
be associated with images stored in a reference set. In this thesis, each image to be
transmitted is one of 36 alphanumeric characters. Before the user transmits the image, the
total packet delay through the transmission channel is measured to determine the current
load on the channel. Based on this measurement, the user selects a desired resolution level
R1...R5 for the image to be transmitted. The wavelet transform of the test image is
computed, and wavelet coefficients at scales higher than the desired level of resolution are
discarded. The image is then encoded using 8 or 12 bits per wavelet coefficient for
transmission over the noisy channel. At the receiving end, the received message is decoded
and the received wavelet coefficients are formed into a vector. The received vector is
compared with a set of 36 reference vectors consisting of the wavelet
coefficients of the reference images, and it is associated with the reference vector at
the smallest Euclidean distance. Once this association has been made, the user can
determine  whether  the  image  is  of  sufficient  interest  to  request  transmission  of  the 
remaining  wavelet  coefficients. 
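The association step at the receiver amounts to a nearest-neighbor search. The following Python sketch is illustrative only: it assumes that each reference image is stored as a square wavelet coefficient matrix with the coarsest scales in the top-left corner, so that discarding fine-scale coefficients is the same as keeping a top-left block. The names `truncate`, `classify`, and `kept_side` are hypothetical.

```python
import math


def truncate(coeffs, kept_side):
    """Flatten the top-left kept_side x kept_side block of a coefficient matrix."""
    return [w for row in coeffs[:kept_side] for w in row[:kept_side]]


def classify(received, references, kept_side):
    """Label of the reference whose truncated vector is nearest in Euclidean distance."""
    best_label, best_dist = None, math.inf
    for label, coeffs in references.items():
        ref = truncate(coeffs, kept_side)
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(received, ref)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

Truncating the reference matrices to the resolution of the received vector keeps the two vectors the same length, which is what allows the comparison to be made directly in the wavelet domain.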

H.  SUMMARY 

The  wavelet  transform  decomposes  the  signal  into  orthogonal  components  at 
different  scales,  which  can  be  used  to  form  feature  vectors  for  image  recognition.  The 
signal  energy  in  each  subband  of  the  wavelet  transform  can  also  be  used  to  create  feature 
vectors. In the next chapter, we show the performance of a multiresolution image
compression and recognition scheme in the presence of channel noise.

IV. IMPLEMENTATION OF MULTIRESOLUTION CHARACTER RECOGNITION

In  this  chapter  we  shall  implement  the  proposed  multiresolution  image  recognition 
scheme  and  present  the  results.  Performance  of  the  proposed  algorithm  using  wavelet 
transform coefficients at all five resolution levels as elements of the feature vectors (see
Chapter III) is presented first. The use of signal energy in each subband of the wavelet
transform  as  elements  of  the  feature  vector  is  presented  next.  The  use  of  Fourier  transform 
coefficients  as  elements  of  the  feature  vectors  to  perform  recognition  on  images  which  are 
linearly  displaced  with  respect  to  the  reference  image  without  first  registering  the  test 
image  is  then  presented.  Finally,  we  show  results  for  each  of  these  cases  in  the  presence  of 
channel  noise. 

The  use  of  text  characters  as  image  objects  was  chosen  for  several  reasons.  First, 
text  characters  are  easy  to  segment,  so  we  did  not  have  to  develop  complicated  low-level 
processing  routines  which  did  not  relate  to  the  main  topic  of  the  thesis.  Second,  since  many 
text  characters  look  very  similar  —  a  0  looks  much  like  an  O,  an  E  looks  much  like  an  F, 
etc.,  text  characters  provide  a  rigorous  test  of  image  recognition  performance.  Finally, 
realistic  data  sets  of  text  characters  are  easy  to  generate. 

A.  IMPLEMENTATION 

We  shall  begin  by  describing  implementation  details  for  each  block  in  the  image 
recognition  scheme  proposed  in  Figure  2.  The  36  images  of  text  characters  we  shall  use  in 
this  thesis  were  obtained  by  typing  the  capital  letters  A-Z  and  numbers  0-9  using  the  OCR- 
A  font  in  a  standard  word  processing  program.  The  letters  were  then  printed  using  a 
standard  laser  printer  and  scanned  using  a  flatbed  scanner  and  a  commercially  available 
image  processing  program.  To  convert  the  scanned  image  into  a  form  suitable  for 
processing  in  Matlab,  the  scanner  output  was  first  saved  in  JPEG  format,  then  converted  to 
TIFF format, and finally converted to a 64-level grayscale image in Matlab. This character
set is shown in Figure 17.

FIGURE 17. Reference Character Set

The  scanned  letters  in  Figure  17  have  sharp  edges,  and  there  is  little  noise  to  hinder 
the  recognition  process.  We  tested  the  effect  of  a  3x3  median  filter  for  enhancement,  but 
only  1-3  pixels  changed  value  during  the  filtering  process,  so  we  judged  this  to  be 
insignificant.  Also  considered  were  edge  sharpening  filters,  but  it  was  determined  that  this 
might  create  unintended  artifacts  and  would  not  produce  discernible  enhancement  with 
such  high  quality  images. 

The  image  in  Figure  17  was  fed  to  a  segmentation  routine  which  first  identified  each 
line  of  text  by  summing  across  the  rows  of  the  image  and  applying  a  threshold.  If  the  sum 
was  less  than  the  threshold,  the  algorithm  concluded  the  row  consisted  entirely  of 
background;  conversely,  if  the  sum  exceeded  the  threshold,  the  row  contained  part  of  a 
character.  The  threshold  was  determined  based  on  a  histogram  of  the  pixel  values.  The  rows 
identified  as  containing  text  characters  were  then  grouped  together;  if  there  was  a  large 
break between identified rows, the algorithm concluded that one line of text had ended and
another had begun.

Next, the individual characters were parsed from each row by applying another
threshold. Plots of individual characters obtained from the character set in Figure 17 are
shown in Figure 18. The computer code used to generate these characters, cutter.m, is
given in the Appendix. These characters form the set of reference images for the
recognition  system  for  all  trials  in  this  thesis. 
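The projection-and-threshold segmentation described above can be sketched as follows. This Python sketch is not a transcription of cutter.m: it assumes that character pixels have larger values than the background, and the threshold values, the `line_gap` parameter, and the function names are hypothetical.

```python
def find_runs(sums, threshold, max_gap=0):
    """Group indices whose sum exceeds threshold into (first, last) runs.

    A run is closed when more than max_gap consecutive indices fall at or
    below the threshold.
    """
    runs = []
    start = last = None
    for i, s in enumerate(sums):
        if s > threshold:
            if start is None:
                start = i
            elif i - last - 1 > max_gap:
                runs.append((start, last))
                start = i
            last = i
    if start is not None:
        runs.append((start, last))
    return runs


def segment(image, row_threshold, col_threshold, line_gap=0):
    """Return one (top, bottom, left, right) box per character."""
    boxes = []
    row_sums = [sum(row) for row in image]
    for top, bottom in find_runs(row_sums, row_threshold, line_gap):
        strip = image[top:bottom + 1]
        col_sums = [sum(col) for col in zip(*strip)]
        for left, right in find_runs(col_sums, col_threshold):
            boxes.append((top, bottom, left, right))
    return boxes
```

Rows are grouped into text lines first, and the same run-finding step is then applied to the column sums inside each line to separate the individual characters.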


FIGURE  18.  Reference  Character  Set  After  Segmentation 


B.  MULTIRESOLUTION  RECOGNITION  WITH  ZERO  NOISE  ADDED 

1.  Wavelet  Coefficients  as  Feature  Vector  Elements 
To test the proposed multiresolution recognition scheme, a set of grayscale test
characters was created by separately scanning and segmenting several lines of text using
the same procedure used to create the reference set. This set of test characters is shown in
Figure 19. Note that each alphanumeric character in the reference set is represented in the
test set.

The wavelet transform of each test image and each reference image was computed
and the coefficients placed into a feature vector as described in Chapter III. Note that the
sizes of the test and reference feature vectors are different for each level of
resolution R1-R5. The Euclidean distance was computed between the test vector and the
reference  vector  for  each  image  in  the  reference  set.  The  test  character  was  associated  with 
the  text  character  in  the  reference  set  with  the  smallest  Euclidean  distance  measure.  The 
declared  associations  were  compared  with  the  correct  associations  to  determine  recognition 
performance.  We  ran  this  test  for  the  Daubechies  filter  family  of  length  4, 12,  and  20;  each 
trial  yielded  1.000  (100%)  recognition  performance. 
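Recognition performance as used here is the fraction of test characters whose declared association matches the correct one. A minimal Python sketch of that bookkeeping follows; the function name and the label lists are hypothetical.

```python
def recognition_performance(declared, correct):
    """Fraction of test characters whose declared label matches the true label."""
    if len(declared) != len(correct):
        raise ValueError("label lists must have the same length")
    hits = sum(1 for d, c in zip(declared, correct) if d == c)
    return hits / len(correct)
```

As a purely hypothetical illustration of the scale, 44 correct associations out of 46 test characters would give a performance of 0.9565 when rounded to four places.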

2.  Subband  Energy  as  Feature  Vector  Elements 

An alternative method which works with even fewer bits transmitted uses the
energy in each subband of the wavelet transform as elements of the feature vector. As given
by Equation 20, the energy in a subband is the sum of the squared wavelet coefficients in that
subband. To compare images at different levels of resolution, we ignore the energy in
small-scale subbands. For example, if the test image is transmitted at resolution level R1,
we ignore the energy in subbands for the four smallest scales. Table 2 shows the size of
the wavelet coefficient matrix for the compressed image at each level of resolution and the
length of the feature vector for 128 x 128 images such as those used in this thesis.


Table 2: FEATURE VECTOR LENGTH USING SUBBAND ENERGY

Resolution Level    Size of Wavelet Coefficient Matrix    Length of Feature Vector
R1                  8 x 8                                 9
R2                  16 x 16                               12
R3                  32 x 32                               15
R4                  64 x 64                               18
R5                  128 x 128                             21

For a length 20 Daubechies filter with zero channel noise added, we observed
0.9565 recognition performance at resolution level R1, 0.9783 for R2, and 1.000
recognition  performance  for  resolution  levels  R3-R5. 
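The feature vector lengths in Table 2 are consistent with three detail subband energies per scale contained in the retained coefficient matrix. The short Python sketch below reproduces the table under that assumption, which is an inference from the tabulated values and is not stated explicitly in the text; the function name is hypothetical.

```python
def energy_feature_length(matrix_side):
    """Three detail subbands for each scale in a matrix_side x matrix_side block.

    matrix_side is assumed to be a power of two, so the number of scales is
    its base-2 logarithm.
    """
    scales = matrix_side.bit_length() - 1
    return 3 * scales


lengths = [energy_feature_length(side) for side in (8, 16, 32, 64, 128)]
```

For example, an 8 x 8 coefficient matrix contains three scales and therefore nine subband energies.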

3.  Fourier  Transform  Coefficients  as  Elements  of  the  Feature  Vectors 

As discussed in Chapter II, one difficulty which complicates the image recognition
problem is classifying images which are not in the same spatial location as the reference
images. When using spatial domain features, it is necessary first to register the image to
compensate for any translation. An alternative approach is to take the Fourier transform of
the image and to use the magnitude of the coefficients as features, which allows one to
perform image recognition without first registering the test image.

To test the performance of this method with zero sensor noise, a set of test images
was created from the reference set by translating each reference character away from the
center by an arbitrary amount. This test set is shown in Figure 20.


FIGURE  20.  Translated  Reference  Character  Set 


The two-dimensional Fourier transform of each image in this test set and in the reference
set was computed. The magnitudes of the transform coefficients for each set of characters
were  formed  into  feature  vectors.  The  measured  distance  between  the  feature  vectors  for 
the  reference  and  test  sets  was  zero  within  the  limits  of  finite  computer  precision  (on  the 
order  of  10'^^. 
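The translation invariance of the Fourier magnitude can be checked numerically. The Python sketch below evaluates the two-dimensional discrete Fourier transform directly, which is slow but sufficient for a small example, and compares the magnitudes of an image and a shifted copy. The invariance is exact for circular shifts; it is assumed here that a character moved across a uniform zero background without leaving the frame behaves as a circular shift. The function names are hypothetical.

```python
import cmath


def dft2_magnitude(img):
    """Magnitudes of the 2-D discrete Fourier transform, evaluated directly."""
    rows, cols = len(img), len(img[0])
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for m in range(rows):
                for n in range(cols):
                    angle = -2.0 * cmath.pi * (u * m / rows + v * n / cols)
                    total += img[m][n] * cmath.exp(1j * angle)
            out_row.append(abs(total))
        out.append(out_row)
    return out


def circular_shift(img, dm, dn):
    """Shift the image down by dm rows and right by dn columns, wrapping around."""
    rows, cols = len(img), len(img[0])
    return [[img[(m - dm) % rows][(n - dn) % cols] for n in range(cols)]
            for m in range(rows)]
```

A shift multiplies each transform coefficient by a unit-magnitude phase factor, so the magnitudes of the original and shifted images agree to within rounding error.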

To test whether it is possible to perform image recognition without registering the
image in the presence of sensor noise, we translated each image in the character set shown
in Figure 19 by a random amount. This test set is shown in Figure 21. This trial yielded
0.9565 recognition performance at resolution level R5. The algorithm incorrectly identified
a “6” as a “9” and vice versa. Note that the “6” and “9” in the reference set are nearly
identical except for a 180 degree rotation. The measured recognition performance was
0.9130 at resolution level R3, 0.8261 at resolution level R2, and 0.3478 at resolution level
R1.

FIGURE 21. Translated Character Test Set


C.  MULTIRESOLUTION  RECOGNITION  WITH  CHANNEL  NOISE  ADDED 

1.  Wavelet  Coefficients  as  Elements  of  Feature  Vectors  for  Grayscale  Images 

To test the performance of the proposed multiresolution image recognition scheme
under realistic conditions, it was necessary to simulate the transmission of data over a noisy
transmission channel. The feature vectors for the grayscale test images used in the previous
section were converted to 8-bit binary data using uniform quantization and Gray coding.
To simulate errors due to additive white Gaussian noise in the communications channel,
random bit errors were introduced at selected bit error rates (BER) using a random number
generator. Twelve trials of the simulation were performed for each filter length, level of
resolution, and bit error rate; the results presented are averages over
these twelve trials. Figure 22 shows recognition performance versus bit error rate for
resolution levels R1, R2, R3, and R5 for a length 4 Daubechies wavelet filter. The computer
code for this trial, wavetest.m, is given in the Appendix. Results for resolution level R4 were
similar to those for R5 and are omitted for the sake of clarity. Figures 23 and 24 show results
for length 12 and length 20 Daubechies wavelet filters, respectively. With respect to the
decrease in recognition performance with increasing compression ratios, the performance
shown  in  Figures  23  and  24  is  consistent  with  that  shown  in  Figure  22. 
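The channel simulation described above (uniform quantization, Gray coding, and random bit errors at a chosen bit error rate) can be sketched in Python as follows. This is not a transcription of wavetest.m. It assumes that the quantizer range is the minimum and maximum of the feature vector and that the receiver knows this range; all function names are hypothetical.

```python
import random


def gray_encode(v):
    """Binary-reflected Gray code of a non-negative integer."""
    return v ^ (v >> 1)


def gray_decode(g):
    """Inverse of gray_encode."""
    v = 0
    while g:
        v ^= g
        g >>= 1
    return v


def quantize(x, lo, hi, bits=8):
    """Uniformly map x in [lo, hi] to an integer code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    x = min(max(x, lo), hi)
    return round((x - lo) / (hi - lo) * levels)


def dequantize(q, lo, hi, bits=8):
    levels = (1 << bits) - 1
    return lo + q / levels * (hi - lo)


def transmit(codes, ber, bits=8, rng=random):
    """Flip each transmitted bit independently with probability ber."""
    received = []
    for code in codes:
        for b in range(bits):
            if rng.random() < ber:
                code ^= 1 << b
        received.append(code)
    return received


def send_vector(features, ber, bits=8, rng=random):
    """Quantize, Gray code, pass through the noisy channel, and decode."""
    lo, hi = min(features), max(features)
    if hi == lo:
        hi = lo + 1.0  # avoid a zero-width quantizer range
    sent = [gray_encode(quantize(x, lo, hi, bits)) for x in features]
    got = transmit(sent, ber, bits, rng)
    return [dequantize(gray_decode(g), lo, hi, bits) for g in got]
```

With a bit error rate of zero, the only distortion is the quantization error, which is at most half a quantization step. Averaging the recognition results of repeated calls at a fixed bit error rate corresponds to the twelve-trial averaging described in the text.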


FIGURE 22. Performance with Length 4 Daubechies Filter on Grayscale Image

FIGURE 23. Performance with Length 12 Daubechies Filter on Grayscale Image

FIGURE 24. Performance with Length 20 Daubechies Filter on Grayscale Image

Recognition  performance  degrades  rapidly  when  the  bit  error  rate  exceeds  lO'^' 
Note that the recognition performance falls with higher compression ratios. This agrees
with our intuition because at higher compression ratios, each bit carries a higher percentage
of the information in the image.

2.  Wavelet  Coefficients  as  Elements  of  Feature  Vectors  for  Binary  Images 

To  show  the  performance  of  a  multiresolution  image  recognition  system  for  binary 
images,  we  tested  the  multiresolution  algorithm  using  a  set  of  images  which  had  been 
converted to binary images by histogram thresholding. Figures 25-27 show the results for
length 4, 12, and 20 Daubechies filters at R1, R2, R3, and R5 for binary versions of the
text characters shown in Figure 19, coded using 8 bits per coefficient. Again, the results for
resolution  level  R4  were  similar  to  those  for  resolution  level  R5  and  are  omitted  for  reasons 
of  clarity.  The  recognition  performance  for  binary  images  is  superior  to  that  for  grayscale 
images  and  less  sensitive  to  bit  errors.  We  observe  that  the  feature  vectors  for  binary  images 
are  more  distinct  than  the  feature  vectors  created  using  grayscale  images.  Therefore,  we 
conclude  that  if  the  prominent  features  in  an  image  depend  primarily  on  the  shape  of  the 
object,  as  in  this  thesis,  binary  thresholding  may  be  a  better  approach. 


FIGURE 25. Performance with Length 4 Daubechies Filter on Binary Image

FIGURE 26. Performance with Length 12 Daubechies Filter on Binary Image (horizontal axis: Bit Error Rate)

FIGURE 27. Performance with Length 20 Daubechies Filter on Binary Image (horizontal axis: Bit Error Rate)


3.  Recognition  Performance  as  a  Function  of  Filter  Length 

We  also  observed  that,  in  general,  longer  filter  lengths  yielded  better  performance 
for  a  given  bit  error  rate  at  a  given  level  of  resolution  for  both  binary  and  grayscale  images. 
Longer  filter  lengths  give  better  performance  because  there  is  less  aliasing  in  the  subbands 
due  to  the  sharper  rolloff  in  frequency  response  near  the  cutoff  frequency,  as  shown  in 
Figure  28.  Less  aliasing  results  in  less  spillover  of  signal  content  from  one  subband  into 
another,  thus  driving  the  feature  vectors  further  apart. 


Figures 29-31 show performance vs. BER for R1, R2, and R3 for length 4, 12, and
20 Daubechies wavelet filters; R4 and R5 both showed results near 1.000 for each filter
length.

One factor which counters the increase in performance with longer filter lengths is
the effect of the length of support for the signal. If a signal is zero outside of some range
between points A and B, the distance AB is said to be the support of the signal. The output
of a wavelet filter tends to have a sharp response when the support of the signal is as long
as the support of the filter. This is similar to the matched filter effect. Note that in Figure
29, for which the signals are most compressed and thus have the shortest support, the length
4 Daubechies wavelet filter outperforms the longer filters. In Figures 30 and 31, this effect
is less prominent.


FIGURE 30. Performance for Daubechies Wavelet Filters of Length 4, 12, and 20 at Resolution Level R2

FIGURE 31. Performance for Daubechies Wavelet Filters of Length 4, 12, and 20 at Resolution Level R3

4.  Recognition  Performance  as  a  Function  of  Bit  Resolution 

To  assess  the  effect  of  bit  resolution  in  the  performance  of  the  multiresolution 
scheme,  we  repeated  the  procedure  discussed  in  the  previous  section  using  12  bits  per 
coefficient.  Due  to  very  large  computational  and  memory  requirements  for  this  test,  we 
conducted  only  one  trial  for  each  BER  and  filter  length.  Figures  32-34  (single  trial)  show 
that  recognition  performance  is  more  robust  in  the  presence  of  channel  noise  with  12  bit 
coding  as  compared  to  the  performance  shown  in  Figures  25-27  using  8  bit  coding 
(averaged over 12 trials).

FIGURE 32. Performance for Daubechies 4 Filter at Resolution Levels R1, R2, R3, and R5 with 12 bit coding

FIGURE 33. Performance for Daubechies 12 Filter at Resolution Levels R1, R2, R3, and R5 with 12 bit coding

FIGURE 34. Performance for Daubechies 20 Filter at Resolution Levels R1, R2, R3, and R5


5.  Subband  Energy  as  Elements  of  the  Feature  Vectors 

Figures  35-37  show  the  performance  for  multiresolution  image  recognition  using 
subband  energy  as  feature  vectors  in  the  presence  of  channel  noise.  Computer  code  for  this 
method, bandtest.m, is given in the Appendix.

FIGURE 35. Performance of a Length 4 Daubechies Filter Using Subband Energy As Features

FIGURE 36. Performance of a Length 12 Daubechies Filter Using Subband Energy As Features

FIGURE 37. Performance of a Length 20 Daubechies Filter Using Subband Energy As Features


Recognition  performance  using  this  method  decreases  rapidly  in  the  presence  of 
channel  noise.  Since  each  feature  vector  element  represents  a  higher  proportion  of  the  total 
information  transmitted,  it  is  reasonable  that  bit  errors  introduced  by  the  channel  would 
have  a  greater  effect  on  the  recognition  performance  than  when  using  the  wavelet 
coefficients  as  feature  vectors. 

6.  Fourier  Transform  Coefficients  as  Elements  of  the  Feature  Vectors 

To combine translation invariant recognition with multiresolution analysis, we first
computed the wavelet transform of the image and then computed the two-dimensional
Fourier transform of the wavelet coefficients in each subband used at the desired level of
resolution. For example, for resolution level R1, we decomposed the image using the
wavelet transform and then computed the two-dimensional Fourier transform of each
subband of the 8 x 8 matrix of coefficients used in R1. The magnitudes of the resulting
Fourier coefficients were formed into a test vector for transmission. Computer code for this
trial, transtest.m, is presented in the Appendix. Figures 38-41 show recognition
performance for this technique in the presence of channel noise. Performance using this
technique is lower than the recognition performance for images which have not undergone
translation. As in the case with no noise, the results for resolution level R1 are much worse
than the performance for images at higher levels of resolution.


FIGURE 38. Performance for Daubechies Filters of Length 4, 12, and 20 at Resolution Level R1 with Linear Translation (horizontal axis: BER)

FIGURE 39. Performance for Daubechies Filters of Length 4, 12, and 20 at Resolution Level R2 with Linear Translation

FIGURE 40. Performance for Daubechies Filters of Length 4, 12, and 20 at Resolution Level R3 with Linear Translation

FIGURE 41. Performance for Daubechies Filters of Length 4, 12, and 20 at Resolution Level R5 with Linear Translation (horizontal axis: BER)


D.  MULTIRESOLUTION  RECOGNITION  WITH  AIRCRAFT  LINE  IMAGES 

To  show  that  multiresolution  image  recognition  can  be  used  as  part  of  an  automatic 
target  recognition  scheme,  we  created  the  following  test  set  by  scanning  clip  art  images 
from a standard image processing application. This test set is shown in Figure 42.

FIGURE 42. Aircraft Line Drawing Test Set

Using the test set shown in Figure 42 and a separately scanned reference set, we
tested multiresolution image recognition performance with wavelet coefficients as feature
vectors. The computer code for this trial, planetest.m, is given in the Appendix. With no
channel noise added, the scheme produced 1.000 recognition performance at all resolution
levels and for each filter length. With channel noise added to simulate random errors caused
by additive white Gaussian noise in the transmission channel, the scheme produced 1.000
recognition performance at resolution levels R2, R3, and R4 for each filter length. At
resolution level R1, we obtained the performance shown in Figure 43.

FIGURE 43. Recognition Performance vs. BER for Aircraft Test Set for Compression Ratio of 256:1


Again,  the  degradation  in  recognition  performance  for  high  compression  ratios  with 
increasing  channel  noise  results  from  the  fact  that  each  bit  carries  a  higher  percentage  of 
the  total  image  information.  The  recognition  performance  at  resolution  levels  R2-R5  in  this 
example  is  superior  to  that  obtained  for  the  text  characters  for  two  reasons.  First,  the  line 
drawings  are  more  distinct  than  the  text  characters,  which  means  that  the  feature  vectors 
obtained  from  them  are  further  apart.  Second,  the  size  of  the  reference  set  is  smaller, 
reducing  the  chance  of  erroneous  identification. 


E.  SUMMARY 

The  wavelet  transform  can  be  used  to  create  feature  vectors  to  perform 
multiresolution  image  recognition  on  grayscale  and  binary  images.  Recognition 
performance  for  different  resolution  levels,  bit  error  rates,  and  filter  lengths  for  grayscale 
and binary images is summarized in Tables 3 and 4.

Table 3: MULTIRESOLUTION IMAGE RECOGNITION PERFORMANCE FOR GRAYSCALE IMAGES USING WAVELET COEFFICIENTS FOR FEATURE VECTORS

Filter Length    R1            R2            R3            R5
Q                256:1         64:1          16:1          1:1

BER = 0.05
4                0.2120        0.3714        0.5670        [illegible]
12               0.1712        0.4366        0.5390
20               0.1739        0.4293        0.6341        0.9150

BER = 0.02
4                0.3750        0.4130        0.5851        0.8874
12               0.2717        0.4746        0.6920        0.9130
20               0.2636        0.4728        0.7264        0.9269

BER = 10^-2
4                0.5571        0.6341        [illegible]
12               0.4212        [illegible]   0.7029
20               0.4293        0.7446        0.9150

BER = 10^-3
4                0.9266        0.8605        0.9547
12               0.8995        0.8641        0.9438        [illegible]
20               0.8967        0.8496        0.9239        [illegible]

BER = 10^-4
4                0.9973        [illegible]
12               0.9946
20               0.9946        [illegible]

Table 4: MULTIRESOLUTION IMAGE RECOGNITION PERFORMANCE FOR BINARY IMAGES USING WAVELET COEFFICIENTS FOR FEATURE VECTORS

Filter Length    R1            R2            R3            R5
Q                256:1         64:1          16:1          1:1

BER = 0.05
4                [illegible]   0.9040
12               [illegible]   [illegible]   0.9656        [illegible]
20               [illegible]   [illegible]   0.9583        [illegible]

BER = 0.02
4                0.9696        [illegible]
12               [illegible]   0.9783
20               0.7391        0.9783        [illegible]   0.9438

BER = 10^-2
4                0.8804        [illegible]
12               0.8533        [illegible]   [illegible]
20               0.8533        0.9870        [illegible]   [illegible]

BER = 10^-3
4                [illegible]   0.9909
12               1.0000        0.9891
20               [illegible]   1.0000        [illegible]   0.9909

BER = 10^-4
4                0.9982        1.0000        1.0000
12               0.9982        1.0000        1.0000
20               1.0000        1.0000        1.0000

If  the  prominent  features  of  an  image  depend  on  shape,  as  in  the  images  used  in  this 
thesis,  binary  thresholding  is  an  important  preprocessing  step.  Recognition  performance 
varies  with  the  bit  error  rate.  Longer  filter  lengths  generally  provide  better  performance,  but 
as  the  support  of  the  signal  decreases  with  increasing  compression,  shorter  filter  lengths  can 
provide  better  results.  Performance  degrades  as  the  compression  ratio  increases,  but  it  is 
possible to achieve 1.000 recognition performance if the probability of bit error is less than
10^-4. If Fourier transform coefficients are used as elements of the feature vectors, it is
possible to perform multiresolution image recognition of images which have undergone
linear translation without registering the image. Performance degradation for compression
ratios higher than 64:1 is particularly severe. Multiresolution image recognition can be
extended to images other than text characters and could be used as part of an automatic
target recognition system.

V. CONCLUSIONS


A  scheme  for  multiresolution  image  recognition  was  presented.  The  scheme  allows 
a  user  to  adjust  the  flow  of  data  over  a  digital  communications  channel  by  varying  the 
desired  resolution  of  the  transmitted  image.  The  transmission  of  grayscale  text  character 
images  at  several  levels  of  resolution  over  a  noisy  digital  channel  was  simulated,  and  plots 
of  the  recognition  performance  versus  bit  error  rate  were  presented.  Fourier  coefficients 
were  used  to  perform  multiresolution  image  recognition  on  images  which  had  undergone 
linear  translation  without  first  registering  the  image.  An  extension  to  the  use  of  binary  text 
characters  and  to  aircraft  line  drawings  was  made. 
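
As a minimal illustration of why Fourier coefficient magnitudes tolerate translation, the following Matlab sketch compares the magnitude of the two-dimensional DFT of an image with that of a shifted copy. It is not part of the thesis code: the 8 x 8 test image and the shift are hypothetical. The equality is exact for circular shifts; a linear translation behaves the same way only while the object stays inside the frame over a zero background.

```matlab
% Sketch: the magnitude of the 2-D DFT is unchanged by a circular shift.
x = zeros(8,8);            % hypothetical 8x8 test image
x(2:4,3:5) = 1;            % small bright square on a zero background
xs = circshift(x,[2 3]);   % shift 2 rows down and 3 columns right

F  = abs(fft2(x));         % Fourier magnitudes of the original image
Fs = abs(fft2(xs));        % Fourier magnitudes of the shifted image

maxdiff = max(abs(F(:) - Fs(:)));  % expected to be at rounding-error level
samepix = isequal(x,xs);           % false: the pixel arrays themselves differ
disp(maxdiff);
```

The pixel arrays differ, yet the magnitude features agree to within rounding error, so no registration step is needed before comparing feature vectors.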

Recognition  performance  degraded  as  bit  error  rate  increased  and  as  the 
compression  ratio  increased.  Longer  filter  lengths  provided  better  performance  at  higher 
levels  of  resolution,  but  shorter  filter  lengths  performed  better  at  a  compression  ratio  of 
256:1.  Results  for  data  coded  at  12  bits  per  coefficient  were  superior  to  those  for  data  coded 
at  8  bits  per  coefficient.  Performance  for  binary  images  exceeded  that  for  grayscale  images, 
indicating  that  conversion  to  binary  via  histogram  thresholding  may  be  a  better  approach 
for  recognizing  images  whose  features  depend  primarily  on  shape.  Results  for  aircraft  line 
drawings  were  superior  to  those  for  either  the  grayscale  or  binary  text  characters. 
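
The simulated transmission behind these results can be sketched as follows. This is an illustrative reconstruction using only built-in Matlab functions, not the helper routines called in Appendix A; the feature values, their assumed range of [0,1], and the bit error probability are hypothetical.

```matlab
% Sketch: send Gray-coded, uniformly quantized feature values over a
% binary symmetric channel. All numeric values here are assumptions.
nbits = 8;                        % bits per coefficient
pb    = 0.01;                     % assumed bit error probability
v     = [0.10 0.45 0.80 0.33];    % hypothetical feature values in [0,1]

q = round(v*(2^nbits - 1));       % uniform quantization to integers 0..255
g = bitxor(q, floor(q/2));        % natural binary -> Gray code

% Unpack into a bit matrix, one row per value, most significant bit first
bits = zeros(length(g), nbits);
for k = 1:nbits
    bits(:,k) = mod(floor(g(:)/2^(nbits-k)), 2);
end

noise = rand(size(bits)) < pb;    % each bit flips independently with probability pb
rx    = double(xor(bits, noise)); % corrupted bit stream

% Gray -> natural binary: each decoded bit is the running XOR of the Gray bits
dec  = mod(cumsum(rx, 2), 2);
qhat = dec * (2.^(nbits-1:-1:0))';  % pack the bits back into integers
vhat = qhat' / (2^nbits - 1);       % received feature values
```

Setting nbits to 12 gives the finer quantization that was compared against 8 bits per coefficient above.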

This  scheme  is  useful  in  any  situation  in  which  image  recognition  is  critical  and  the 
available  bandwidth  is  limited  and  subject  to  change  according  to  the  traffic  on  the  channel. 
Typical  applications  include  the  recognition  of  images  transmitted  from  a  remote  sensing 
platform  and  searching  an  image  archive  at  a  remote  site. 

Future  research  is  needed  to  extend  the  proposed  approach  to  different  types  of 
images,  such  as  real  images  of  military  targets,  medical  images,  and  human  faces. 
Recognition  performance  in  the  presence  of  sensor  noise,  image  rotation,  and  image  aspect 
variation  is  another  important  area  of  research. 

APPENDIX  A 


Matlab  code  implementing  the  multiresolution  image  recognition  algorithms 
presented  in  this  thesis  is  provided  below. 
aconv.m

function y = aconv(f,x)
% aconv -- Convolution Tool for Two-Scale Transform
%  Usage
%    y = aconv(f,x)
%  Inputs
%    f    filter
%    x    1-d signal
%  Outputs
%    y    filtered result
%
%  Description
%    Filtering by periodic convolution of x with the
%    time-reverse of f.
%
%  See Also
%    iconv, UpDyadHi, UpDyadLo, DownDyadHi, DownDyadLo
%

n = length(x);
p = length(f);
if p < n,
    xpadded = [x x(1:p)];
else
    z = zeros(1,p);
    for i=1:p,
        imod = 1 + rem(i-1,n);
        z(i) = x(imod);
    end
    xpadded = [x z];
end
fflip = reverse(f);
ypadded = filter(fflip,1,xpadded);
y = ypadded(p:(n+p-1));

%
% Copyright (c) 1993. David L. Donoho
%

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%
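
The padded-filter formulation used in aconv can be checked against a direct double loop. The sketch below is self-contained and does not call the WaveLab routines; the signal and filter values are hypothetical, and f(end:-1:1) stands in for the WaveLab helper reverse.

```matlab
% Sketch: periodic filtering of x with the time-reverse of f, two ways.
x = [1 2 3 4 5 6 7 8];   % hypothetical signal
f = [1 -1 2];            % hypothetical filter
n = length(x);
p = length(f);

% Direct form: y(t) = sum over k of f(k) * x(t+k-1), indices wrapped modulo n
y = zeros(1,n);
for t = 1:n
    for k = 1:p
        y(t) = y(t) + f(k)*x(1 + mod(t+k-2, n));
    end
end

% Padded form, as in the listing above (case p < n)
xpadded = [x x(1:p)];
ypadded = filter(f(end:-1:1), 1, xpadded);
y2 = ypadded(p:(n+p-1));

maxdiff = max(abs(y - y2));   % the two forms should agree
```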

bandtest.m

%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Determine performance of multiresolution image recognition system
%using subband energy as feature vectors
%%%%%%%%%%%%%%%%%%%%%%%%%%%
clear all;
loadgrayref;%Load reference images
loadgraytest;%Load test images
coktum = 6;%Initialize global variables for simulation
global START_OK;
START_OK = 1;
Pgrayband1 = zeros(5,3);%Initialize output variable - Overall correct recognition
for trials = 1:12;%Run simulation for 12 trials
  Pcorr = zeros(5,3);%Initialize intermediate output variables
  m = zeros(46,36);%Initialize Euclidean distance measure matrix
  b = [4 12 20];%Run for 3 filter lengths
  biterr = [.05 .02 .01 .001 .0001];%Run for 5 bit error probabilities

  %%%%%%%%
  %Main program
  %Perform simulated transmission over noisy channel for each filter length and
  %bit error probability for this level of resolution (R1)
  for u2 = 1:length(biterr);%Run for all bit error probabilities
    thisbiterr = biterr(u2);%P(b) for this trial
    for u = 1:length(b);%Run for all filter lengths
      count2 = 1;%test image index
      y = zeros(1,46);%store identified text character in this vector
      qmf = MakeONFilter('Daubechies',b(u));%Create Daubechies filter for this trial
      for k = 1:5;%Run for all test characters
        for j = 1:13;
          %load test character
          if exist(['graytest' num2str(k) num2str(j)]) == 0;
            break;
          else t = eval(['graytest' num2str(k) num2str(j)]);
            %Compare t with all reference characters
            count = 1;
            for p = 1:5;
              for q = 1:8;
                if p == 5 & q > 4;
                  break;
                else r = eval(['grayref' num2str(p) num2str(q)]);
                  r = FWT2_PO(r,1,qmf);%Compute wavelet transform of reference image
                  r = bandenergy2(r);%Compute feature vector using subband energy
                  if count == 1;%If this is the first reference image
                    t = FWT2_PO(t,1,qmf);%compute wavelet transform of
                                         %test image
                    t = bandenergy2(t);%Compute feature vector using subband energy
                    t = quantize(t,8,'scale');%Quantize uniformly using 8 bits
                                              %per coefficient
                    t = bin_enc(t,8);%Encode using natural binary code
                    t = bin2gray(t);%Convert to gray coding
                    %Introduce random errors using random number generator with
                    %probability = thisbiterr
                    noise = zeros(size(t));
                    bitchange = find(rand(size(t)) < thisbiterr);
                    noise(bitchange) = ones(size(bitchange));
                    %Corrupt transmitted signal t with random bit errors found above
                    t = xor(t,noise);
                    t = gray2bin(t);%convert to natural binary code
                    t = bin_dec(t);%decode signal
                  end;
                  m(count2,count) = (r-t)*(r-t)';%Compute distance between reference
                                                 %and test vectors
                  count = count + 1;%increment reference vector index
                end;
              end;
            end;
          end;
          [n j] = min(m(count2,:));%Determine index of reference image with minimum
                                   %Euclidean distance
          y(count2) = num2let(j);%Convert to ASCII character
          count2 = count2+1;%Increment test vector index
        end;
      end;
      %Compare results obtained this trial with correct answer
      ideal = ['2' '0' '9' '5' '6' 'Q' 'U' 'I' 'C' 'K' 'F' 'O' 'X' 'E' 'S' 'J' 'U' 'M' 'P' 'E' 'D' 'O' 'V' 'E' ...
               'R' 'T' 'H' 'E' '1' '3' '4' '8' '7' 'L' 'A' 'Z' 'Y' 'B' 'R' 'O' 'W' 'N' 'D' 'O' 'G' 'S'];
      Pcorr(u2,u) = 1 - length(find(y - ideal))/length(ideal);%Determine error performance
    end;
  end;
  Pgrayband1 = Pgrayband1 + Pcorr;%Determine overall performance as sum of all trials
  save Pgrayband1 Pgrayband1%Store overall performance
end;


cutter.m

%%%%%%%%%%%%%%%%%%%%%%%
%Preprocessing
%%%%%%%%%%%%%%%%%%%%%%%
clear all;
load ocralastref;%stored reference image for OCR-A data
lpf = [.25*ones(1,3); .25 1 .25; .25*ones(1,3)]/3;%Low Pass Filter
x = filter2(lpf,x);%Use Low-Pass Filter on Image
x = medfilt2(x,[3 3]);%Use Median Filter on Result

%%%%%%%%%%%%%%%%%%%%%%%
%Identify Lines of Text
%%%%%%%%%%%%%%%%%%%%%%%
[m,n] = size(x);

linex = [];%Stores identified lines of text
liney = [];%Stores identified columns of text
y = sum(x');%Sum down rows
linex = find(y > 2.2E5);%Apply threshold
linex2 = [linex(2:length(linex)) 0];
[n,j] = find((linex - linex2) <= -10);%If there is a large break between
                                      %identified rows of text, make a new
                                      %row
j2 = j+1;
linfindxstart = linex(j);%Identifies beginning of row
linfindxend = linex(j2);%Identifies end of row

%Create a matrix of all identified
for k = 1:[length(linfindxstart)];%lines of text
  eval(['set' num2str(k) ' = x([linfindxstart(k):linfindxend(k)],:);']);
end;

%%%%%%%%%%%%%%%%%%%%%%%%
%Parse individual characters from lines of text
%%%%%%%%%%%%%%%%%%%%%%%%
count = 0; %Total number of Characters
for k = 1:length(linfindxstart);
  if count > 46;%Quit when count = 46;
    break;
  end;
  y = sum(eval(['set' num2str(k)]));%Work on the kth row of the original image
  liney = find(y > 1.5E4);%Apply Threshold
  liney2 = [liney(2:length(liney)) 0];
  [n,j] = find((liney - liney2) <= -10);%If there is a sharp break between identified
                                        %columns, start a new character
  j2 = j+1;
  linfindystart = liney(j);%Stores the starting column for character
  linfindyend = liney(j2);%Stores the last column for the character
  for v = 1:[length(linfindystart)];%Separate out and store the individual characters
                                    %in a square matrix of dyadic size
    count = count + 1;
    t = eval(['set' num2str(k) '(:,(linfindystart(v):linfindyend(v)))']);
    eval(['ocralastref' num2str(k) num2str(v) ' = t;']);
    %Determine the largest dyadic number greater than the
    %maximum dimension of the image
    [n,Jcol] = dyadlength(eval(['ocralastref' num2str(k) num2str(v) '(1,:)']));
    [n,Jrow] = dyadlength(eval(['ocralastref' num2str(k) num2str(v) '(:,1)']));
    Jref = max(Jcol,Jrow);
    t = ['ocralastref' num2str(k) num2str(v)];
    t = eval(t);
    colpad = 2^Jref - size(t,2);%Number of pixels to pad columns
    rowpad = 2^Jref - size(t,1);%Number of pixels to pad rows
    colpadleft = floor(colpad/2);%Pad rows and columns
    colpadright = colpad - colpadleft;
    rowpadtop = floor(rowpad/2);
    rowpadbot = rowpad - rowpadtop;
    eval(['ocralastref' num2str(k) num2str(v) ' = [254*ones(size(t,1),colpadleft) t 254*ones(size(t,1),colpadright)];']);
    t = eval(['ocralastref' num2str(k) num2str(v)]);
    eval(['ocralastref' num2str(k) num2str(v) ' = [254*ones(rowpadtop,size(t,2)); t; 254*ones(rowpadbot,size(t,2))];']);
    t = eval(['ocralastref' num2str(k) num2str(v)]);
    t = t/sqrt(sum(sum(t.^2)));%Normalize resulting image to have energy = 1;
    eval(['grayref' num2str(k) num2str(v) ' = t;']);%Store the result in its own file
    eval(['save grayref' num2str(k) num2str(v) ' grayref' num2str(k) num2str(v)]);
  end;
end;

downdyadhi.m

function d = DownDyadHi(x,qmf2)
% DownDyadHi -- Hi-Pass Downsampling operator (periodized)
%  Usage
%    d = DownDyadHi(x,f)
%  Inputs
%    x    1-d signal at fine scale
%    f    filter
%  Outputs
%    y    1-d signal at coarse scale
%
%  See Also
%    DownDyadLo, UpDyadHi, UpDyadLo, FWT_PO, iconv
%

d = aconv(qmf2,x);
n = length(d);
d = d(1:2:(n-1));

%
% Copyright (c) 1993. Iain M. Johnstone
%

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%

downdyadlo.m

function d = DownDyadLo(x,qmf1)
% DownDyadLo -- Lo-Pass Downsampling operator (periodized)
%  Usage
%    d = DownDyadLo(x,f)
%  Inputs
%    x    1-d signal at fine scale
%    f    filter
%  Outputs
%    y    1-d signal at coarse scale
%
%  See Also
%    DownDyadHi, UpDyadHi, UpDyadLo, FWT_PO, aconv
%

d = aconv(qmf1,x);
n = length(d);
d = d(1:2:(n-1));

%
% Copyright (c) 1993. Iain M. Johnstone
%

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%


dyadlength.m

function [n,J] = dyadlength(x)
% dyadlength -- Find length and dyadic length of array
%  Usage
%    [n,J] = dyadlength(x)
%  Inputs
%    x    array of length n = 2^J (hopefully)
%  Outputs
%    n    length(x)
%    J    least power of two greater than n
%
%  Side Effects
%    A warning is issued if n is not a power of 2.
%
%  See Also
%    quadlength, dyad, dyad2ix
%

n = length(x) ;
J = ceil(log(n)/log(2));
%if 2^J ~= n,
%  disp('Warning in dyadlength: n != 2^J')
%end

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%


fftid2.m

%function fftid2: takes the 2-dimensional Fourier transform of each subband
%of the wavelet coefficient matrix wc and places the normalized Fourier
%coefficient magnitudes into a feature vector

function idvector = fftid2(wc);

idvector = 0;%Initialize output vector
[n,J] = quadlength(wc);%Determine dyadic size of square array
for k = J:-1:1;%Perform for each scale
  high = (2^(k-1)+1):2^k;%High-pass range
  low = 1:2^(k-1);%Low-pass range
  N = length(low)^2;%maximum dimension

  %For each of the three subbands at this scale, take the 2-
  %dimensional Fourier transform and append the magnitude of each onto
  %a feature vector
  idvector = [idvector reshape(abs(fft2(wc(high,low))),1,N)];
  idvector = [idvector reshape(abs(fft2(wc(low,high))),1,N)];
  idvector = [idvector reshape(abs(fft2(wc(high,high))),1,N)];
end;

%Add the (1,1) wavelet coefficient
idvector = [idvector wc(1)];

%Normalize the resulting vector
idvector = idvector/length(idvector);
idvector = idvector/sqrt(sum(idvector.^2));

FWT2_PO.m

function wc = FWT2_PO(x,L,qmf)
% FWT2_PO -- 2-d MRA wavelet transform (periodized, orthogonal)
%  Usage
%    wc = FWT2_PO(x,L,qmf)
%  Inputs
%    x    2-d image (n by n array, n dyadic)
%    L    coarse level
%    qmf  quadrature mirror filter
%  Outputs
%    wc   2-d wavelet transform
%
%  Description
%    A two-dimensional Wavelet Transform is computed for the
%    array x. To reconstruct, use IWT2_PO.
%
%  See Also
%    IWT2_PO, MakeONFilter
%

[n,J] = quadlength(x);
wc = x;
nc = n;
for jscal=J-1:-1:L,
    top = (nc/2+1):nc; bot = 1:(nc/2);
    for ix=1:nc,
        row = wc(ix,1:nc);
        wc(ix,bot) = downdyadlo(row,qmf);
        wc(ix,top) = downdyadhi(row,qmf);
    end
    for iy=1:nc,
        row = wc(1:nc,iy)';
        wc(top,iy) = downdyadhi(row,qmf)';
        wc(bot,iy) = downdyadlo(row,qmf)';
    end
    nc = nc/2;
end

%
% Copyright (c) 1993. David L. Donoho
%

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%
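
The row-then-column structure of the two-dimensional transform can be seen in a single-level sketch. The example below substitutes the two-tap Haar filter for the Daubechies filters used in this thesis so that it needs no helper functions; the 4 x 4 input is hypothetical. Because the transform is orthogonal, the total energy of the coefficients should equal that of the image.

```matlab
% Sketch: one level of a separable 2-D Haar transform (an assumed
% stand-in for FWT2_PO with a longer filter).
x = magic(4);                 % hypothetical 4x4 image
s = 1/sqrt(2);                % Haar normalization
n = size(x,1);
wc = zeros(n);

% Filter and downsample each row: low-pass half left, high-pass half right
for ix = 1:n
    row = x(ix,:);
    wc(ix,1:n/2)   = s*(row(1:2:n) + row(2:2:n));
    wc(ix,n/2+1:n) = s*(row(1:2:n) - row(2:2:n));
end

% Repeat down each column: low-pass half on top, high-pass half below
for iy = 1:n
    col = wc(:,iy)';
    wc(1:n/2,iy)   = (s*(col(1:2:n) + col(2:2:n)))';
    wc(n/2+1:n,iy) = (s*(col(1:2:n) - col(2:2:n)))';
end

Ein  = sum(x(:).^2);          % energy of the image
Eout = sum(wc(:).^2);         % energy of the wavelet coefficients
```

The upper-left quadrant of wc holds the coarse approximation; each of its entries is half the sum of a 2 x 2 block of the input.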
iconv.m

function y = iconv(f,x)
% iconv -- Convolution Tool for Two-Scale Transform
%  Usage
%    y = iconv(f,x)
%  Inputs
%    f    filter
%    x    1-d signal
%  Outputs
%    y    filtered result
%
%  Description
%    Filtering by periodic convolution of x with f
%
%  See Also
%    aconv, UpDyadHi, UpDyadLo, DownDyadHi, DownDyadLo
%

n = length(x);
p = length(f);
if p <= n,
    xpadded = [x((n+1-p):n) x];
else
    z = zeros(1,p);
    for i=1:p,
        imod = 1 + rem(p*n - p + i-1,n);
        z(i) = x(imod);
    end
    xpadded = [z x];
end
ypadded = filter(f,1,xpadded);
y = ypadded((p+1):(n+p));

%
% Copyright (c) 1993. David L. Donoho
%

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%


loadgrayref.m

%%%%%%%%%%%%%%
%loadgrayref loads the grayscale reference images
%%%%%%%%%%%%%%
for k = 1:5;
  for j = 1:8;
    if k == 5 & j > 4;
      break;
    else eval(['load grayref' num2str(k) num2str(j)]);
    end;
  end;
end;


loadgraytest.m

%%%%%%%%%%%%%%
%loadgraytest loads the grayscale test images
%%%%%%%%%%%%%%
for k = 1:4;
  for j = 1:13;
    if exist(['graytest' num2str(k) num2str(j) '.mat']) == 0;
      break;
    else eval(['load graytest' num2str(k) num2str(j)]);
    end;
  end;
end;

MakeONFilter.m

function f = MakeONFilter(Type,Par)
% MakeONFilter -- Generate Orthonormal QMF Filter for Wavelet Transform
%  Usage
%    qmf = MakeONFilter(Type,Par)
%  Inputs
%    Type   string, 'Haar', 'Beylkin', 'Coiflet', 'Daubechies',
%           'Symmlet', 'Vaidyanathan'
%    Par    integer, e.g. if Type = 'Coiflet', Par=3 specifies
%           a Coiflet-3 wavelet
%  Outputs
%    qmf    quadrature mirror filter
%
%  Description
%    The Haar filter (which could be considered a Daubechies-2) was the
%    first wavelet, though not called as such, and is discontinuous.
%
%    The Beylkin filter places roots for the frequency response function
%    close to the Nyquist frequency on the real axis.
%
%    The Coiflet filters are designed to give both the mother and father
%    wavelets 2*Par vanishing moments; here Par may be one of 1,2,3,4 or 5.
%
%    The Daubechies filters maximize the smoothness of the father wavelet
%    (or "scaling function") by maximizing the rate of decay of its Fourier
%    transform. They are indexed by their length, Par, which may be one of
%    4,6,8,10,12,14,16,18 or 20.
%
%    Symmlets are the "least asymmetric" compactly-supported wavelets with
%    maximum number of vanishing moments, here indexed by Par, which ranges
%    from 4 to 10.
%
%    The Vaidyanathan filter gives an exact reconstruction, but does not
%    satisfy any moment condition. The filter has been optimized for
%    speech coding.
%
%  See Also
%    FWT_PO, IWT_PO, FWT2_PO, IWT2_PO, WPAnalysis
%
%  References
%    The books by Daubechies and Wickerhauser.
%

if strcmp(Type,'Haar'),
    f = [1 1] ./ sqrt(2);
end

if strcmp(Type,'Beylkin'),
    f = [.099305765374 .424215360813 .699825214057 ...
         .449718251149 -.110927598348 -.264497231446 ...
         .026900308804 .155538731877 -.017520746267 ...
         -.088543630623 .019679866044 .042916387274 ...
         -.017460408696 -.014365807969 .010040411845 ...
         .001484234782 -.002736031626 .000640485329];
end


if strcmp(Type,'Coiflet'),
    if Par==1,
        f = [.038580777748 -.126969125396 -.077161555496 ...
             .607491641386 .745687558934 .226584265197];
    end
    if Par==2,
        f = [.016387336463 -.041464936782 -.067372554722 ...
             .386110066823 .812723635450 .417005184424 ...
             -.076488599078 -.059434418646 .023680171947 ...
             .005611434819 -.001823208871 -.000720549445];
    end
    if Par==3,
        f = [-.003793512864 .007782596426 .023452696142 ...
             -.065771911281 -.061123390003 .405176902410 ...
             .793777222626 .428483476378 -.071799821619 ...
             -.082301927106 .034555027573 .015880544864 ...
             -.009007976137 -.002574517688 .001117518771 ...
             .000466216960 -.000070983303 -.000034599773];
    end
    if Par==4,
        f = [.000892313668 -.001629492013 -.007346166328 ...
             .016068943964 .026682300156 -.081266699680 ...
             -.056077313316 .415308407030 .782238930920 ...
             .434386056491 -.066627474263 -.096220442034 ...
             .039334427123 .025082261845 -.015211731527 ...
             -.005658286686 .003751436157 .001266561929 ...
             -.000589020757 -.000259974552 .000062339034 ...
             .000031229876 -.000003259680 -.000001784985];
    end
    if Par==5,
        f = [-.000212080863 .000358589677 .002178236305 ...
             -.004159358782 -.010131117538 .023408156762 ...
             .028168029062 -.091920010549 -.052043163216 ...
             .421566206729 .774289603740 .437991626228 ...
             -.062035963906 -.105574208706 .041289208741 ...
             .032683574283 -.019761779012 -.009164231153 ...
             .006764185419 .002433373209 -.001662863769 ...
             -.000638131296 .000302259520 .000140541149 ...
             -.000041340484 -.000021315014 .000003734597 ...
             .000002063806 -.000000167408 -.000000095158];
    end
end


if strcmp(Type,'Daubechies'),
    if Par==4,
        f = [.482962913145 .836516303738 ...
             .224143868042 -.129409522551];
    end
    if Par==6,
        f = [.332670552950 .806891509311 ...
             .459877502118 -.135011020010 ...
             -.085441273882 .035226291882];
    end
    if Par==8,
        f = [.230377813309 .714846570553 ...
             .630880767930 -.027983769417 ...
             -.187034811719 .030841381836 ...
             .032883011667 -.010597401785];
    end
    if Par==10,
        f = [.160102397974 .603829269797 .724308528438 ...
             .138428145901 -.242294887066 -.032244869585 ...
             .077571493840 -.006241490213 -.012580751999 ...
             .003335725285];
    end
    if Par==12,
        f = [.111540743350 .494623890398 .751133908021 ...
             .315250351709 -.226264693965 -.129766867567 ...
             .097501605587 .027522865530 -.031582039317 ...
             .000553842201 .004777257511 -.001077301085];
    end
    if Par==14,
        f = [.077852054085 .396539319482 .729132090846 ...
             .469782287405 -.143906003929 -.224036184994 ...
             .071309219267 .080612609151 -.038029936935 ...
             -.016574541631 .012550998556 .000429577973 ...
             -.001801640704 .000353713800];
    end
    if Par==16,
        f = [.054415842243 .312871590914 .675630736297 ...
             .585354683654 -.015829105256 -.284015542962 ...
             .000472484574 .128747426620 -.017369301002 ...
             -.044088253931 .013981027917 .008746094047 ...
             -.004870352993 -.000391740373 .000675449406 ...
             -.000117476784];
    end
    if Par==18,
        f = [.038077947364 .243834674613 .604823123690 ...
             .657288078051 .133197385825 -.293273783279 ...
             -.096840783223 .148540749338 .030725681479 ...
             -.067632829061 .000250947115 .022361662124 ...
             -.004723204758 -.004281503682 .001847646883 ...
             .000230385764 -.000251963189 .000039347320];
    end
    if Par==20,
        f = [.026670057901 .188176800078 .527201188932 ...
             .688459039454 .281172343661 -.249846424327 ...
             -.195946274377 .127369340336 .093057364604 ...
             -.071394147166 -.029457536822 .033212674059 ...
             .003606553567 -.010733175483 .001395351747 ...
             .001992405295 -.000685856695 -.000116466855 ...
             .000093588670 -.000013264203];
    end
end


if strcmp(Type,'Symmlet'),
    if Par==4,
        f = [-.107148901418 -.041910965125 .703739068656 ...
             1.136658243408 .421234534204 -.140317624179 ...
             -.017824701442 .045570345896];
    end
    if Par==5,
        f = [.038654795955 .041746864422 -.055344186117 ...
             .281990696854 1.023052966894 .896581648380 ...
             .023478923136 -.247951362613 -.029842499869 ...
             .027632152958];
    end
    if Par==6,
        f = [.021784700327 .004936612372 -.166863215412 ...
             -.068323121587 .694457972958 1.113892783926 ...
             .477904371333 -.102724969862 -.029783751299 ...
             .063250562660 .002499922093 -.011031867509];
    end
    if Par==7,
        f = [.003792658534 -.001481225915 -.017870431651 ...
             .043155452582 .096014767936 -.070078291222 ...
             .024665659489 .758162601964 1.085782709814 ...
             .408183939725 -.198056706807 -.152463871896 ...
             .005671342686 .014521394762];
    end
    if Par==8,
        f = [.002672793393 -.000428394300 -.021145686528 ...
             .005386388754 .069490465911 -.038493521263 ...
             -.073462508761 .515398670374 1.099106630537 ...
             .680745347190 -.086653615406 -.202648655286 ...
             .010758611751 .044823623042 -.000766690896 ...
             -.004783458512];
    end
    if Par==9,
        f = [.001512487309 -.000669141509 -.014515578553 ...
             .012528896242 .087791251554 -.025786445930 ...
             -.270893783503 .049882830959 .873048407349 ...
             1.015259790832 .337658923602 -.077172161097 ...
             .000825140929 .042744433602 -.016303351226 ...
             -.018769396836 .000876502539 .001981193736];
    end
    if Par==10,
        f = [.001089170447 .000135245020 -.012220642630 ...
             -.002072363923 .064950924579 .016418869426 ...
             -.225558972234 -.100240215031 .667071338154 ...
             1.088251530500 .542813011213 -.050256540092 ...
             -.045240772218 .070703567550 .008152816799 ...
             -.028786231926 -.001137535314 .006495728375 ...
             .000080661204 -.000649589896];
    end
end

if strcmp(Type,'Vaidyanathan'),
    f = [-.000062906118 .000343631905 -.000453956620 ...
         -.000944897136 .002843834547 .000708137504 ...
         -.008839103409 .003153847056 .019687215010 ...
         -.014853448005 -.035470398607 .038742619293 ...
         .055892523691 -.077709750902 -.083928884366 ...
         .131971661417 .135084227129 -.194450471766 ...
         -.263494802488 .201612161775 .635601059872 ...
         .572797793211 .250184129505 .045799334111];
end

f = f ./ norm(f);

%
% Copyright (c) 1993-5. Jonathan Buckheit and David Donoho
%

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%
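
As a quick sanity check on the tabulated coefficients, an orthonormal low-pass filter should have unit energy, sum to the square root of 2, and be orthogonal to its own shift by two samples. The sketch below tests these properties for the length-4 Daubechies filter copied from the listing above; the tolerances are assumptions chosen to match the 12-digit precision of the table.

```matlab
% Sketch: verify orthonormality properties of the Daubechies-4 filter.
f = [.482962913145 .836516303738 .224143868042 -.129409522551];
f = f ./ norm(f);                   % same normalization as the last step of MakeONFilter

energy = sum(f.^2);                 % should equal 1
dcgain = sum(f);                    % should be close to sqrt(2)
shift2 = f(1)*f(3) + f(2)*f(4);     % inner product with the double shift, close to 0

disp([energy dcgain shift2]);
```

The same three checks can be applied to any of the longer filters by summing the products f(k)*f(k+2m) over every even shift 2m.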

MirrorFilt.m

function y = MirrorFilt(x)
% MirrorFilt -- Apply (-1)^t modulation
%  Usage
%    h = MirrorFilt(l)
%  Inputs
%    l    1-d signal
%  Outputs
%    h    1-d signal with DC frequency content shifted
%         to Nyquist frequency
%
%  Description
%    h(t) = (-1)^(t-1) * x(t), 1 <= t <= length(x)
%
%  See Also
%    DyadDownHi
%

y = -( (-1).^(1:length(x)) ).*x;

%
% Copyright (c) 1993. Iain M. Johnstone
%

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%
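
The effect of the (-1)^t modulation can be seen on a constant signal, whose energy moves from the DC bin to the Nyquist bin. The sketch below repeats the one-line body of MirrorFilt inline so that it runs on its own; the second half applies the same modulation to the length-4 Daubechies filter and is only an illustration of the sign alternation, not a statement of how the thesis routines build their high-pass filter.

```matlab
% Sketch: (-1)^(t-1) modulation moves DC content to the Nyquist frequency.
x = ones(1,4);                          % constant (pure DC) signal
y = -((-1).^(1:length(x))).*x;          % alternating signs: 1 -1 1 -1
Y = abs(fft(y));                        % all energy lands in bin 3 (Nyquist for n = 4)

% The same modulation applied to a low-pass filter removes its DC gain
h = [.482962913145 .836516303738 .224143868042 -.129409522551];
g = -((-1).^(1:length(h))).*h;          % sign-alternated copy of h
dcgain = sum(g);                        % close to zero
```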


planetest.m

clear all;
loadplaneref;%Load reference images
loadplanetest;%Load test images
Pcorrplane1 = zeros(5,3);%Initialize overall recognition performance output variable
for trials = 1:12;%Run for 12 trials
  Pcorr = zeros(5,3);%Initialize intermediate output variable
  m = zeros(10,10);%Initialize Euclidean distance matrix
  b = [4 12 20];%Run for all filter lengths
  biterr = [.05 .02 .01 .001 .0001];%Run for all bit error rates

  %%%%%%%%%%%%%%%%%%%%%%%%%
  %Main program - Compute Euclidean Distance between test and reference
  %image feature vectors using wavelet coefficients as features
  %%%%%%%%%%%%%%%%%%%%%%%%%
  for u2 = 1:length(biterr);
    thisbiterr = biterr(u2);
    for u = 1:length(b);
      count2 = 1;%Initialize test image index
      y = zeros(1,10);%Store identified plane here
      qmf = MakeONFilter('Daubechies',b(u));%Create Daubechies wavelet filter of proper length
      for k = 1:3;
        for j = 1:4;
          %Load test image
          if exist(['testplane' num2str(k) num2str(j)]) == 0;
            break;
          else t = eval(['testplane' num2str(k) num2str(j)]);
            count = 1;
            for p = 1:3;
              %Load reference image
              for q = 1:4;
                if p == 3 & q > 2;
                  break;
                else r = eval(['refplane' num2str(p) num2str(q)]);
                  r = FWT2_PO(r,1,qmf);%Compute wavelet transform of
                                       %reference image
                  r = reshape(r(1:128,1:128),1,128^2);%Store coefficients
                                                      %as feature vector
                  if count == 1;%If this is the first pass
                    t = FWT2_PO(t,1,qmf);%Compute wavelet
                                         %transform of test
                                         %image
                    t = reshape(t(1:128,1:128),1,128^2);%Store coefficients
                                                        %as feature vector
                    t = sig2bin(t,8,'uniform','gray');%Simulate digital
                                                      %transmission
                    %Introduce bit errors using a random number generator
                    bitchange = find(rand(1,length(t)) < thisbiterr);
                    noise = zeros(size(t));
                    noise(bitchange) = ones(size(bitchange));
                    %Corrupt transmitted vector with random bit errors
                    t = xor(t,noise);
                    t = bin2sig(t,8,'uniform','gray');%Decode the signal
                  end;
                  m(count2,count) = (r-t)*(r-t)';%Compute Euclidean Distance
                  count = count + 1;%Increment reference image index
                end;
              end;
            end;
          end;
          [n j] = min(m(count2,:));%Determine index of reference image with smallest distance measure
          y(count2) = j;%Store index of identified plane
          count2 = count2+1;%Increment test image index
        end;
      end;
      ideal = 1:10;%Correct answer
      Pcorr(u2,u) = 1 - length(find(y - ideal))/length(ideal);%Compute observed probability of correct recognition
    end;
  end;
  Pcorrplane1 = Pcorrplane1 + Pcorr;%Sum over all trials
  save Pcorrplane1 Pcorrplane1;%save output variable
end;

quadlength.m

function [n,J] = quadlength(x)
% quadlength -- Find length and dyadic length of square matrix
%  Usage
%    [n,J] = quadlength(x)
%  Inputs
%    x    2-d image; size(n,n), n = 2^J (hopefully)
%  Outputs
%    n    length(x)
%    J    least power of two greater than n
%
%  Side Effects
%    A warning message is issued if n is not a power of 2,
%    or if x is not a square matrix.
%

s = size(x);
n = s(1);
if s(2) ~= s(1),
    disp('Warning in quadlength: nr != nc')
end

k = 1 ; J = 0; while k < n , k=2*k; J = 1+J ; end ;
if k ~= n,
    disp('Warning in quadlength: n != 2^J')
end

%
% Copyright (c) 1993. David L. Donoho
%

%
% Part of WaveLab Version .700
% Built Friday, December 8, 1995 8:36:37 PM
% This is Copyrighted Material
% For Copying permissions see COPYING.m
% Comments? e-mail wavelab@playfair.stanford.edu
%

transtestm 

%%%%%%%%%%%%%%%%%%%% 

%Determine  performance  of  multiresolution  image  recognition  scheme 
%using  Fourier  transform  coefficients  as  elements  of  feature  vectors 
%%%%%%%%%%%%%%%%%%%% 
clear  all; 

loadgrayref;%load  reference  images 
loadgraytranstest;%load  test  images 
%Imtialize  global  variables  for  simulation 
global  START_OK; 

START_0K=1; 

Pgraytransl=  zeros(5,3);%Initialize  overall  performance  output  variable 
for  trials  =  l:12;%run  for  12  trials 

Pcorr  =  zeros(5,3);%Initialize  intermediate  ou^rut  variable  -  percentage 

%of  correctly  identified  images  for  single  trial 

m  =  zeros(46,36);%Initialize  Euclidean  distance  measure  matrix 

b  =  [4  12  20];%Length  of  Daubechies  filter 

biterr  =  [.05  .02 .01 .001 .0001];%Probability  of  bit  error 


%%%%%%%%%%%%%%%%%%% 

%Main  program:  run  simulation  for  all  filter  lengths  and  bit  errors; 

%%%%%%%%%%%%%%%%%%% 

for  u2  =  l:length(biterr);%Run  for  all  bit  errors 

thisbiterr  =  biterr(u2);%P(b)  for  this  trial 

for  u  =  l:length(b);%Rim  for  all  filter  lengths 

count2  =  l;%Test  image  index 

y  5=  zeros(l,46);%Store  text  character  identified  in  this  vector 

qmf  =  MakeONFilter(‘Daubechies’,b(u));%Build  Daubechies  wavelet  filter  of  given  length 

for  k  =  l:4;%Load  test  image 
forj=l:13; 

if  exist([‘transtestgray’  num2str(k)  num2str(j)  ‘.mat’])  ==  0; 
break; 

else  eval([‘load  transtestgray’  num2str(k)  num2str(j)]); 
t  =  eval([‘transtestgray’  num2str(k)  num2str(j)]); 

            %Compare with all reference images
            count = 1;%reference image index
            for p = 1:5;
              for q = 1:8;
                if p == 5 & q > 4;
                  break;
                else
                  r = eval(['grayref' num2str(p) num2str(q)]);
                  r = FWT2_PO(r,1,qmf);%Compute wavelet transform of reference image
                  r = fftid(r);%Compute feature vector by taking
                               %2-dimensional Fourier transform
                               %of each subband of wavelet transform

                  if count == 1;%If this is the first reference image
                    t = FWT2_PO(t,1,qmf);%Compute wavelet transform of test image
                    t = fftid(t);%Compute feature vector by taking
                                 %2-dimensional Fourier transform
                                 %of each subband of wavelet transform
                    t = quantize(t,8,'scale');%Quantize uniformly using 8 bits per coefficient
                    t = bin_enc(t,8);%Encode using natural binary code
                    t = bin2gray(t);%Convert to Gray coding

                    %Introduce random errors using random number generator with
                    %probability = thisbiterr
                    bitchange = find(rand(size(t)) < thisbiterr);
                    noise = zeros(size(t));
                    noise(bitchange) = ones(size(bitchange));

                    %Corrupt transmitted signal t with random bit errors found above
                    t = xor(t,noise);

                    t = gray2bin(t);%Convert to natural binary code
                    t = bin_dec(t);%Decode signal
                  end;

                  m(count2,count) = (r-t)*(r-t)';%Compute distance between reference
                                                 %and test vectors
                  count = count + 1;
                end;
              end;
            end;
          end;

          [n j] = min(m(count2,:));%Determine index of reference image with minimum
                                   %Euclidean distance
          y(count2) = num2let(j);%Convert to ASCII character
          count2 = count2+1;%Increment test vector index
        end;
      end;

      %Compare results obtained this trial with correct answer
      ideal = ['2' '0' '9' '5' '6' 'Q' 'U' 'I' 'C' 'K' 'F' 'O' 'X' 'E' 'S' 'J' 'U' 'M' 'P' 'E' 'D' 'O' 'V' 'E' ...
               'R' 'T' 'H' 'E' '1' '3' '4' '8' '7' 'L' 'A' 'Z' 'Y' 'B' 'R' 'O' 'W' 'N' 'D' 'O' 'G' 'S'];

      Pcorr(u2,u) = 1 - length(find(y - ideal))/length(ideal);%Determine error performance
    end;
  end;

  Pgraytransl = Pgraytransl + Pcorr;%Determine overall performance as sum of all trials
  save Pgraytransl Pgraytransl;%Store overall performance
end;
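
The listing above passes each test feature vector through quantization, natural binary encoding, Gray coding, random bit errors, and the inverse steps. The helper routines it calls (quantize, bin_enc, bin2gray, gray2bin, bin_dec) are not reproduced in this section, so the following is only a minimal sketch of the same chain written with built-in bit operations. The feature vector x, the min/max scaling rule, and all variable names are assumptions made for illustration and are not the thesis routines.

```matlab
% Illustrative sketch only: not the thesis helper routines.
nbits = 8;                      % bits per coefficient, as in the listing
pbit  = 0.01;                   % probability of bit error
x = [-1.5 0.25 0.7 3.0 -0.1];   % made-up feature vector (assumption)

% Uniform quantization to integer levels 0 .. 2^nbits-1 over the data range
xmin = min(x);
step = (max(x) - xmin) / (2^nbits - 1);
q = round((x - xmin) / step);

% Natural binary -> Gray code: g = q XOR (q shifted right by one bit)
g = bitxor(q, bitshift(q, -1));

% Expand each Gray-coded level into a row of bits, most significant bit first
bits = zeros(numel(g), nbits);
for k = 1:nbits
    bits(:, k) = bitget(g(:), nbits - k + 1);
end

% Binary symmetric channel: flip each bit independently with probability pbit
noise = rand(size(bits)) < pbit;
rxbits = xor(bits, noise);

% Reassemble the received bits into Gray-coded integers
weights = (2 .^ (nbits-1:-1:0))';
grx = double(rxbits) * weights;

% Gray code -> natural binary: XOR together all right shifts of the codeword
qrx = grx;
s = bitshift(grx, -1);
while any(s)
    qrx = bitxor(qrx, s);
    s = bitshift(s, -1);
end

% Dequantize back to coefficient values
xhat = qrx' * step + xmin;
```

Coefficients whose bits are all received correctly come back to within half a quantization step of the original value. Because adjacent quantization levels differ in exactly one bit under Gray coding, a bit error that moves a codeword to the pattern of a neighboring level changes the decoded value by only one step.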

wavetest.m 

%%%%%%%%%%%%%%%%%%% 

%Test the performance of multiresolution image recognition over noisy channel
%using wavelet coefficients as feature vectors
%%%%%%%%%%%%%%%%%%%
clear all;

global START_OK;%Set global variable for transmission channel simulation
START_OK = 1;

loadgrayref;%load reference images
loadgraytest;%load test images

Pcorrla = zeros(5,3);%Initialize output variable - Probability of correct recognition overall

for trials = 1:12;%Perform test 12 times

  Pcorr = zeros(5,3);%Intermediate output variable - Probability of correct recognition this trial
  m = zeros(46,36);%Euclidean distance measure matrix
  b = [4 12 20];%Length of Daubechies filters to test
  biterr = [.05 .02 .01 .001 .0001];%Probability of bit errors to test

  %%%%%%
  %Main part of program: determine recognition performance for each bit error probability and
  %each filter length at this resolution level (R1)

  for u2 = 1:length(biterr);%Run for all bit errors
    thisbiterr = biterr(u2);%P(b) this trial
    for u = 1:length(b);%Run for all filter lengths
      count2 = 1;%Index of test character
      y = zeros(1,46);%Store text character identified in this vector
      qmf = MakeONFilter('Daubechies',b(u));%Create Daubechies filter of correct length for this trial

      for k = 1:5;%load next test character
        for j = 1:13;
          if exist(['graytest' num2str(k) num2str(j)]) == 0;
            break;
          else
            t = eval(['graytest' num2str(k) num2str(j)]);

            count = 1;%Compute Euclidean distance to all reference characters
            for p = 1:5;
              for q = 1:8;
                if p == 5 & q > 4;
                  break;
                else
                  r = eval(['grayref' num2str(p) num2str(q)]);
                  r = FWT2_PO(r,1,qmf);%Compute wavelet decomposition using
                                       %wavelet filter qmf
                  r = reshape(r(1:128,1:128),1,128^2);%Reshape all coefficients into vector

                  if count == 1;%If this is the first reference character,
                                %compute the wavelet transform of the
                                %test image and simulate transmitting it
                                %over a noisy channel
                    t = FWT2_PO(t,1,qmf);%Compute wavelet decomposition using qmf
                    t = reshape(t(1:128,1:128),1,128^2);%Put into vector
                    t = quantize(t,8,'scale');%Quantize uniformly using 8 bits per coefficient
                    t = bin_enc(t,8);%Use natural binary encoding
                    t = bin2gray(t);%Convert to Gray coding

                    %Introduce bit errors randomly with probability
                    %equal to thisbiterr using random number generator
                    bitchange = find(rand(1,length(t)) < thisbiterr);
                    noise = zeros(size(t));
                    noise(bitchange) = ones(size(bitchange));

                    %Corrupt t with random errors from above
                    t = xor(t,noise);

                    %Recover corrupted signal
                    t = gray2bin(t);%Convert to natural binary
                    t = bin_dec(t);%Decode signal
                  end;

                  m(count2,count) = (r-t)*(r-t)';%Compute Euclidean distance
                  count = count + 1;%Increment reference image index
                end;
              end;
            end;
          end;

          [n j] = min(m(count2,:));%Determine index of smallest Euclidean distance
          y(count2) = num2let(j);%Convert to ASCII letter
          count2 = count2+1;%Increment test image index
        end;
      end;

      %Compare results with correct answer
      ideal = ['2' '0' '9' '5' '6' 'Q' 'U' 'I' 'C' 'K' 'F' 'O' 'X' 'E' 'S' 'J' 'U' 'M' 'P' 'E' 'D' 'O' 'V' 'E' ...
               'R' 'T' 'H' 'E' '1' '3' '4' '8' '7' 'L' 'A' 'Z' 'Y' 'B' 'R' 'O' 'W' 'N' 'D' 'O' 'G' 'S'];

      Pcorr(u2,u) = 1 - length(find(y - ideal))/length(ideal);%Determine error performance
    end;
  end;

  Pcorrla = Pcorrla + Pcorr%Determine overall performance as sum of individual trials
  trials%Display current trial number

  save Pcorrla Pcorrla;%Save output
end;

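
Both listings classify a test image by finding the reference feature vector at the smallest squared Euclidean distance and then score the result against the known character string. The following is a minimal sketch of that decision rule on made-up two-dimensional feature vectors. The arrays refs and tests, the label string, and the expected answer are hypothetical values chosen for illustration, not data from the thesis.

```matlab
% Illustrative sketch only: made-up feature vectors and labels.
refs   = [0 0; 1 0; 0 1; 1 1];        % one reference feature vector per row
labels = 'ABCD';                      % character associated with each reference
tests  = [0.1 -0.2; 0.9 0.8; 0.2 0.7];% one (noisy) test feature vector per row
ideal  = 'ADC';                       % correct answer for each test vector

% Squared Euclidean distance between every test and every reference vector
m = zeros(size(tests,1), size(refs,1));
for i = 1:size(tests,1)
    for k = 1:size(refs,1)
        d = refs(k,:) - tests(i,:);
        m(i,k) = d * d';
    end
end

% Pick the reference with the minimum distance for each test vector
[~, idx] = min(m, [], 2);
y = labels(idx');

% Fraction of correctly identified characters, computed as in the listings
Pcorr = 1 - length(find(y - ideal)) / length(ideal);
```

The expression find(y - ideal) returns the positions where the recognized and correct character codes differ, so its length is the number of recognition errors.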




LIST  OF  REFERENCES 


1.  Gonzalez,  Rafael  C.,  and  Woods,  Richard  E.,  Digital  Image  Processing, 
Addison  Wesley  Publishing  Company,  Inc.,  Reading,  MA,  1992. 

2.  Pratt,  William  K.,  Digital  Image  Processing,  John  Wiley  &  Sons,  New  York, 
NY,  1978. 

3.  Castleman,  Kenneth  R.,  Digital  Image  Processing,  Prentice  Hall,  Inc., 
Englewood  Cliffs,  NJ,  1996. 

4.  Cohen,  Albert,  and  Kovacevic,  Jelena,  “Wavelets:  The  Mathematical 
Background.”  Proceedings  of  the  IEEE.  Vol.  84,  No.  4,  April  1996. 

5.  Stollnitz,  Eric  J.,  DeRose,  Tony  D.,  and  Salesin,  David  H.,  “Wavelets  for 
Computer  Graphics:  A  Primer.”  IEEE  Computer  Graphics  and  Applications, 
May  1995. 

6.  Burrus,  C.  S.,  and  Gopinath,  R.  A.,  Introduction  to  Wavelets  and  Wavelet 
Transforms.  Rice  University,  Houston,  TX,  1993. 

7.  Kaiser,  Gerald,  A  Friendly  Guide  to  Wavelets.  Birkhauser,  Boston,  1994. 

8.  Proakis,  John  G.,  and  Manolakis,  Dimitris  G.,  Digital  Signal  Processing, 
Macmillan  Publishing  Company.  New  York,  NY,  1992. 

9.  Carvalho,  Ronald,  Multiresolution  Image  Compression  Using  Subband 
Coding  and  Wavelet  Decomposition,  Engineer’s  Thesis,  Naval  Postgraduate 
School,  Monterey,  December,  1994. 

10.  Leon,  Steven  J.,  Linear  Algebra  With  Applications,  Macmillan  College 
Publishing  Company,  New  York,  NY,  1994. 

11.  Akansu,  Ali  N.,  and  Haddad,  Richard  A.,  Multiresolution  Signal 
Decomposition,  Academic  Press,  Inc.,  Boston,  MA,  1992. 




INITIAL  DISTRIBUTION  LIST 


Defense Technical Information Center
8725 John J. Kingman Rd., STE 0944
Ft. Belvoir, VA 22060-6218

Dudley Knox Library
Naval Postgraduate School
411 Dyer Rd.
Monterey, CA 93943-5101

Chairman, Code EC
Department of Electrical and Computer Engineering
Naval Postgraduate School
Monterey, CA 93943-5121

Prof. Murali Tummala, Code EC/Tu
Department of Electrical and Computer Engineering
Naval Postgraduate School
Monterey, CA 93943-5121

Prof. Ralph Hippenstiel, Code EC/Hi
Department of Electrical and Computer Engineering
Naval Postgraduate School
Monterey, CA 93943-5121

LCDR Gregory H. Skinner
Officer-In-Charge
Patrol Wings Atlantic Detachment
Advanced Maritime Projects Office
Jacksonville, FL 32212-0051

LT William M. Peyton, Jr.
1848 Cherokee Dr. #6
Salinas, CA 93906


